Posted to gitbox@hive.apache.org by "ayushtkn (via GitHub)" <gi...@apache.org> on 2023/05/19 07:47:05 UTC

[GitHub] [hive] ayushtkn commented on a diff in pull request #4337: HIVE-27358: Add test cases for HIVE-22816 (result caching with views)

ayushtkn commented on code in PR #4337:
URL: https://github.com/apache/hive/pull/4337#discussion_r1198653522


##########
ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out:
##########
@@ -682,3 +682,568 @@ POSTHOOK: Input: default@src
 POSTHOOK: Input: default@tab1_n1
 #### A masked pattern was here ####
 1028
+PREHOOK: query: create view join_count_transactional_view as select count(*) from tab1_n1 join tab2_n1 on (tab1_n1.key = tab2_n1.key)
+PREHOOK: type: CREATEVIEW
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+PREHOOK: Output: database:default
+PREHOOK: Output: default@join_count_transactional_view
+POSTHOOK: query: create view join_count_transactional_view as select count(*) from tab1_n1 join tab2_n1 on (tab1_n1.key = tab2_n1.key)
+POSTHOOK: type: CREATEVIEW
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@join_count_transactional_view
+PREHOOK: query: explain select * from join_count_transactional_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_transactional_view
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+POSTHOOK: query: explain select * from join_count_transactional_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_transactional_view
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
+        Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: tab1_n1
+                  filterExpr: key is not null (type: boolean)
+                  properties:
+                    insideView TRUE
+                  Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                  Filter Operator
+                    predicate: key is not null (type: boolean)
+                    Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      expressions: key (type: string)
+                      outputColumnNames: _col0
+                      Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        null sort order: z
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+            Execution mode: vectorized, llap
+            LLAP IO: may be used (ACID table)
+        Map 4 
+            Map Operator Tree:
+                TableScan
+                  alias: tab2_n1
+                  filterExpr: key is not null (type: boolean)
+                  properties:
+                    insideView TRUE
+                  Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                  Filter Operator
+                    predicate: key is not null (type: boolean)
+                    Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      expressions: key (type: string)
+                      outputColumnNames: _col0
+                      Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        null sort order: z
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+            Execution mode: vectorized, llap
+            LLAP IO: may be used (ACID table)
+        Reducer 2 
+            Execution mode: llap
+            Reduce Operator Tree:
+              Merge Join Operator
+                condition map:
+                     Inner Join 0 to 1
+                keys:
+                  0 _col0 (type: string)
+                  1 _col0 (type: string)
+                Statistics: Num rows: 791 Data size: 6328 Basic stats: COMPLETE Column stats: COMPLETE
+                Group By Operator
+                  aggregations: count()
+                  minReductionHashAggr: 0.99
+                  mode: hash
+                  outputColumnNames: _col0
+                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                  Reduce Output Operator
+                    null sort order: 
+                    sort order: 
+                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                    value expressions: _col0 (type: bigint)
+        Reducer 3 
+            Execution mode: vectorized, llap
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: count(VALUE._col0)
+                mode: mergepartial
+                outputColumnNames: _col0
+                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                File Output Operator
+                  compressed: false
+                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                  table:
+                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: select * from join_count_transactional_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_transactional_view
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+POSTHOOK: query: select * from join_count_transactional_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_transactional_view
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+1028
+test.comment="View on transactional tables, should use cache"
+PREHOOK: query: explain select * from join_count_transactional_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_transactional_view
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+POSTHOOK: query: explain select * from join_count_transactional_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_transactional_view
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+STAGE DEPENDENCIES:
+  Stage-0 is a root stage
+
+STAGE PLANS:
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+      Cached Query Result: true
+
+PREHOOK: query: select * from join_count_transactional_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_transactional_view
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+POSTHOOK: query: select * from join_count_transactional_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_transactional_view
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+1028
+PREHOOK: query: insert into tab1_n1 select * from default.src limit 1
+PREHOOK: type: QUERY
+PREHOOK: Input: default@src
+PREHOOK: Output: default@tab1_n1
+POSTHOOK: query: insert into tab1_n1 select * from default.src limit 1
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@src
+POSTHOOK: Output: default@tab1_n1
+POSTHOOK: Lineage: tab1_n1.key SIMPLE [(src)src.FieldSchema(name:key, type:string, comment:default), ]
+POSTHOOK: Lineage: tab1_n1.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ]
+test.comment="Cache entry should be invalidated from prior insert, should not use cache"
+PREHOOK: query: explain select * from join_count_transactional_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_transactional_view
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+POSTHOOK: query: explain select * from join_count_transactional_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_transactional_view
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Input: default@tab2_n1
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
+        Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: tab1_n1
+                  filterExpr: key is not null (type: boolean)
+                  properties:
+                    insideView TRUE
+                  Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                  Filter Operator
+                    predicate: key is not null (type: boolean)
+                    Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      expressions: key (type: string)
+                      outputColumnNames: _col0
+                      Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        null sort order: z
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+            Execution mode: vectorized, llap
+            LLAP IO: may be used (ACID table)
+        Map 4 
+            Map Operator Tree:
+                TableScan
+                  alias: tab2_n1
+                  filterExpr: key is not null (type: boolean)
+                  properties:
+                    insideView TRUE
+                  Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                  Filter Operator
+                    predicate: key is not null (type: boolean)
+                    Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      expressions: key (type: string)
+                      outputColumnNames: _col0
+                      Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        null sort order: z
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+            Execution mode: vectorized, llap
+            LLAP IO: may be used (ACID table)
+        Reducer 2 
+            Execution mode: llap
+            Reduce Operator Tree:
+              Merge Join Operator
+                condition map:
+                     Inner Join 0 to 1
+                keys:
+                  0 _col0 (type: string)
+                  1 _col0 (type: string)
+                Statistics: Num rows: 792 Data size: 6336 Basic stats: COMPLETE Column stats: COMPLETE
+                Group By Operator
+                  aggregations: count()
+                  minReductionHashAggr: 0.99
+                  mode: hash
+                  outputColumnNames: _col0
+                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                  Reduce Output Operator
+                    null sort order: 
+                    sort order: 
+                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                    value expressions: _col0 (type: bigint)
+        Reducer 3 
+            Execution mode: vectorized, llap
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: count(VALUE._col0)
+                mode: mergepartial
+                outputColumnNames: _col0
+                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                File Output Operator
+                  compressed: false
+                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                  table:
+                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: create view join_count_view as select count(*) from tab1_n1 join src on (tab1_n1.key = src.key)
+PREHOOK: type: CREATEVIEW
+PREHOOK: Input: default@src
+PREHOOK: Input: default@tab1_n1
+PREHOOK: Output: database:default
+PREHOOK: Output: default@join_count_view
+POSTHOOK: query: create view join_count_view as select count(*) from tab1_n1 join src on (tab1_n1.key = src.key)
+POSTHOOK: type: CREATEVIEW
+POSTHOOK: Input: default@src
+POSTHOOK: Input: default@tab1_n1
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@join_count_view
+POSTHOOK: Lineage: join_count_view._c0 EXPRESSION [(tab1_n1)tab1_n1.null, (src)src.null, ]
+PREHOOK: query: explain select * from join_count_view
+PREHOOK: type: QUERY
+PREHOOK: Input: default@join_count_view
+PREHOOK: Input: default@src
+PREHOOK: Input: default@tab1_n1
+#### A masked pattern was here ####
+POSTHOOK: query: explain select * from join_count_view
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@join_count_view
+POSTHOOK: Input: default@src
+POSTHOOK: Input: default@tab1_n1
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
+        Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: tab1_n1
+                  filterExpr: key is not null (type: boolean)
+                  properties:
+                    insideView TRUE
+                  Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                  Filter Operator
+                    predicate: key is not null (type: boolean)
+                    Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      expressions: key (type: string)
+                      outputColumnNames: _col0
+                      Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        null sort order: z
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 501 Data size: 43587 Basic stats: COMPLETE Column stats: COMPLETE

Review Comment:
   We have recently seen data size values cause a lot of flakiness. If they are not necessary for this test, can we explore masking the size values?
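
   To illustrate what masking would do to the plan output, here is a minimal sketch that rewrites the `Data size` value on a `Statistics:` line. It is only an illustration: the class `DataSizeMasker`, the method `mask`, and the `#Masked#` placeholder are hypothetical names chosen for this example, not Hive's actual masking mechanism or the way the qtest framework is configured.

```java
import java.util.regex.Pattern;

public class DataSizeMasker {
    // Matches the "Data size: <digits>" fragment of an EXPLAIN Statistics line.
    // The pattern and the placeholder below are assumptions made for this sketch.
    private static final Pattern DATA_SIZE = Pattern.compile("(Data size: )\\d+");

    /** Replaces the numeric data size with a fixed placeholder; other text is left untouched. */
    public static String mask(String line) {
        return DATA_SIZE.matcher(line).replaceAll("$1#Masked#");
    }

    public static void main(String[] args) {
        String line = "Statistics: Num rows: 501 Data size: 43587 "
                + "Basic stats: COMPLETE Column stats: COMPLETE";
        System.out.println(mask(line));
    }
}
```

   With masking along these lines, the row counts (500 vs. 501) would still show that the insert changed the table, while the byte sizes would no longer appear in the expected output.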



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: gitbox-help@hive.apache.org