Posted to user@hive.apache.org by "Long, Andrew" <lo...@amazon.com> on 2016/06/30 20:11:53 UTC
Cartesian Joins are failing + Controlling the number of mappers in Hive+TEZ
Hello everyone,
I’ve run into a situation where Tez grossly deoptimizes Cartesian map joins by not creating enough map tasks (as seen below). It seems that the nodes eventually go OOM and become unhealthy. Is there a way to force Hive to increase the number of map tasks? I’ve tried a number of settings, but they don’t seem to have any effect.
Cheers, Andrew
hive> SET hive.execution.engine = tez;
hive> SET hive.exec.reducers.bytes.per.reducer=8000000;
hive> SET mapreduce.input.fileinputformat.split.maxsize=8000000;
hive> SET hive.vectorized.execution.enabled=true;
hive> SET hive.stats.join.factor=1.3;
hive> SET hive.exec.reducers.max=4036;
hive> SET mapred.max.split.size;
mapred.max.split.size=8000000
hive> SET mapreduce.input.fileinputformat.split.maxsize 276480;
mapreduce.input.fileinputformat.split.maxsize 276480 is undefined
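A note on the last command above: it has no "=" between the property name and the value, so the CLI appears to have looked up a property whose name is the whole string, which is why it answered "is undefined" and left the split size unchanged. A minimal sketch of the assignment form, reusing the value from the attempt above (the value itself is only the one tried here, not a recommendation):

```sql
-- Assign a value: the '=' between name and value is required.
SET mapreduce.input.fileinputformat.split.maxsize=276480;
-- Print the current value to confirm the assignment took effect.
SET mapreduce.input.fileinputformat.split.maxsize;
```

Running the second statement afterwards should echo the property with the new value instead of "is undefined".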
hive>
>
> CREATE TABLE intial_starting_balance AS
> SELECT
> D.caldate as accounting_date_local,
> RF.account_number as account_number,
> RF.aggregation_type,
> RF.application_name,
> RF.company_code,
> RF.cost_center,
> RF.dimension01,
> RF.dimension02,
> RF.dimension03,
> RF.dimension04,
> RF.dimension05,
> RF.dimension06,
> RF.dimension07,
> RF.dimension08,
> RF.dimension09,
> RF.dimension10,
> RF.dimension11,
> RF.dimension12,
> RF.dimension13,
> RF.dimension14,
> RF.dimension15,
> RF.financial_event_type,
> RF.functional_currency_code,
> RF.func_currency_amt_sum,
> RF.func_currency_balance,
> RF.func_currency_beg_balance,
> RF.func_journal_balance,
> RF.func_journal_beg_balance,
> RF.func_journal_sum,
> RF.gl_group_id,
> RF.gl_product_line,
> RF.jl_description,
> RF.ledger_id,
> RF.local_currency_amt_sum,
> RF.local_currency_balance,
> RF.local_currency_beg_balance,
> RF.local_currency_code ,
> RF.local_journal_balance,
> RF.local_journal_beg_balance,
> RF.local_journal_sum,
> RF.location,
> RF.ltd_amortization_amount,
> RF.post_to_gl,
> RF.principal_amount,
> RF.project,
> RF.quantity_sum,
> RF.rfd_id,
> RF.sales_channel,
> RF.sl_db_name,
> RF.source_system,
> RF.timezone_id
> FROM oldest_act_dt_previous_day_rf RF, date_range D;
Warning: Map Join MAPJOIN[7][bigTable=rf] in task 'Map 2' is a cross product
Query ID = hadoop_20160630181616_76699ea6-ed61-4866-b67b-d3f1af345103
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1467163153575_0026)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1                 KILLED      1          0        0        1       0       1
Map 2                 KILLED    115          0        0      115       0     115
--------------------------------------------------------------------------------
VERTICES: 00/02 [>>--------------------------] 0% ELAPSED TIME: 5.54 s
--------------------------------------------------------------------------------
Status: Killed
Dag received [DAG_TERMINATE, DAG_KILL] in RUNNING state.
Kill Dag request received from client
Vertex killed, vertexName=Map 1, vertexId=vertex_1467163153575_0026_1_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1, Vertex vertex_1467163153575_0026_1_00 [Map 1] killed/failed due to:DAG_TERMINATED]
Vertex killed, vertexName=Map 2, vertexId=vertex_1467163153575_0026_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:115, Vertex vertex_1467163153575_0026_1_01 [Map 2] killed/failed due to:DAG_TERMINATED]
DAG did not succeed due to DAG_KILL. failedVertices:0 killedVertices:2
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
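A possible explanation for the low task count in the Tez run above: on Tez, input splits are grouped before map tasks are created, and that grouping is governed by the Tez properties tez.grouping.min-size and tez.grouping.max-size rather than by the MapReduce split size alone. That may be why Map 2 was given only 115 tasks. A minimal sketch, assuming these properties are honored by the Hive and Tez versions on this cluster; the byte values are illustrative guesses, not tuned recommendations:

```sql
SET hive.execution.engine=tez;
-- Lower bound on the size of a grouped split, in bytes (illustrative value).
SET tez.grouping.min-size=4000000;
-- Upper bound on the size of a grouped split, in bytes (illustrative value).
SET tez.grouping.max-size=8000000;
```

If the grouping is indeed the limiting factor, smaller grouped splits should yield more map tasks, each producing a smaller share of the cross product, at the cost of more task start-up overhead.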
hive> set hive.execution.engine=mr;
hive>
>
> CREATE TABLE intial_starting_balance AS
> SELECT
> D.caldate as accounting_date_local,
> RF.account_number as account_number,
> RF.aggregation_type,
> RF.application_name,
> RF.company_code,
> RF.cost_center,
> RF.dimension01,
> RF.dimension02,
> RF.dimension03,
> RF.dimension04,
> RF.dimension05,
> RF.dimension06,
> RF.dimension07,
> RF.dimension08,
> RF.dimension09,
> RF.dimension10,
> RF.dimension11,
> RF.dimension12,
> RF.dimension13,
> RF.dimension14,
> RF.dimension15,
> RF.financial_event_type,
> RF.functional_currency_code,
> RF.func_currency_amt_sum,
> RF.func_currency_balance,
> RF.func_currency_beg_balance,
> RF.func_journal_balance,
> RF.func_journal_beg_balance,
> RF.func_journal_sum,
> RF.gl_group_id,
> RF.gl_product_line,
> RF.jl_description,
> RF.ledger_id,
> RF.local_currency_amt_sum,
> RF.local_currency_balance,
> RF.local_currency_beg_balance,
> RF.local_currency_code ,
> RF.local_journal_balance,
> RF.local_journal_beg_balance,
> RF.local_journal_sum,
> RF.location,
> RF.ltd_amortization_amount,
> RF.post_to_gl,
> RF.principal_amount,
> RF.project,
> RF.quantity_sum,
> RF.rfd_id,
> RF.sales_channel,
> RF.sl_db_name,
> RF.source_system,
> RF.timezone_id
> FROM oldest_act_dt_previous_day_rf RF, date_range D;
Warning: Map Join MAPJOIN[7][bigTable=rf] in task 'Stage-4:MAPRED' is a cross product
Query ID = hadoop_20160630181616_804cc05b-8103-45a9-93ff-4c258e995d58
Total jobs = 1
Execution log at: /mnt/tmp/hadoop/hadoop_20160630181616_804cc05b-8103-45a9-93ff-4c258e995d58.log
2016-06-30 06:17:04 Starting to launch local task to process map join; maximum memory = 932184064
2016-06-30 06:17:06 Dump the side-table for tag: 1 with group count: 1 into file: file:/mnt/tmp/hadoop/57e52e6d-3178-4e8e-be7f-101939fe11c2/hive_2016-06-30_18-16-59_589_3295892868228064144-1/-local-10003/HashTable-Stage-4/MapJoin-mapfile11--.hashtable
2016-06-30 06:17:06 Uploaded 1 File to: file:/mnt/tmp/hadoop/57e52e6d-3178-4e8e-be7f-101939fe11c2/hive_2016-06-30_18-16-59_589_3295892868228064144-1/-local-10003/HashTable-Stage-4/MapJoin-mapfile11--.hashtable (931 bytes)
2016-06-30 06:17:06 End of local task; Time Taken: 1.341 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1467163153575_0027, Tracking URL = http://ip-172-31-1-35.ec2.internal:20888/proxy/application_1467163153575_0027/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1467163153575_0027
Hadoop job information for Stage-4: number of mappers: 324; number of reducers: 0
2016-06-30 18:17:15,415 Stage-4 map = 0%, reduce = 0%
hive> EXPLAIN EXTENDED SELECT
> D.caldate as accounting_date_local,
> RF.account_number as account_number,
> RF.aggregation_type,
> RF.application_name,
> RF.company_code,
> RF.cost_center,
> RF.dimension01,
> RF.dimension02,
> RF.dimension03,
> RF.dimension04,
> RF.dimension05,
> RF.dimension06,
> RF.dimension07,
> RF.dimension08,
> RF.dimension09,
> RF.dimension10,
> RF.dimension11,
> RF.dimension12,
> RF.dimension13,
> RF.dimension14,
> RF.dimension15,
> RF.financial_event_type,
> RF.functional_currency_code,
> RF.func_currency_amt_sum,
> RF.func_currency_balance,
> RF.func_currency_beg_balance,
> RF.func_journal_balance,
> RF.func_journal_beg_balance,
> RF.func_journal_sum,
> RF.gl_group_id,
> RF.gl_product_line,
> RF.jl_description,
> RF.ledger_id,
> RF.local_currency_amt_sum,
> RF.local_currency_balance,
> RF.local_currency_beg_balance,
> RF.local_currency_code ,
> RF.local_journal_balance,
> RF.local_journal_beg_balance,
> RF.local_journal_sum,
> RF.location,
> RF.ltd_amortization_amount,
> RF.post_to_gl,
> RF.principal_amount,
> RF.project,
> RF.quantity_sum,
> RF.rfd_id,
> RF.sales_channel,
> RF.sl_db_name,
> RF.source_system,
> RF.timezone_id
> FROM oldest_act_dt_previous_day_rf RF, date_range D;
Warning: Map Join MAPJOIN[7][bigTable=rf] in task 'Map 2' is a cross product
OK
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_JOIN
TOK_TABREF
TOK_TABNAME
oldest_act_dt_previous_day_rf
RF
TOK_TABREF
TOK_TABNAME
date_range
D
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
.
TOK_TABLE_OR_COL
D
caldate
accounting_date_local
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
account_number
account_number
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
aggregation_type
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
application_name
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
company_code
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
cost_center
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension01
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension02
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension03
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension04
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension05
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension06
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension07
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension08
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension09
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension10
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension11
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension12
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension13
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension14
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
dimension15
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
financial_event_type
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
functional_currency_code
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_currency_amt_sum
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_currency_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_currency_beg_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_journal_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_journal_beg_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
func_journal_sum
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
gl_group_id
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
gl_product_line
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
jl_description
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
ledger_id
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_currency_amt_sum
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_currency_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_currency_beg_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_currency_code
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_journal_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_journal_beg_balance
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
local_journal_sum
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
location
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
ltd_amortization_amount
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
post_to_gl
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
principal_amount
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
project
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
quantity_sum
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
rfd_id
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
sales_channel
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
sl_db_name
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
source_system
TOK_SELEXPR
.
TOK_TABLE_OR_COL
RF
timezone_id
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
Stage: Stage-1
Tez
Edges:
Map 2 <- Map 1 (BROADCAST_EDGE)
DagName: hadoop_20160630184242_e2719632-5ad7-4e8c-a92d-ea74daa27abb:2
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: d
Statistics: Num rows: 41 Data size: 3854 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Reduce Output Operator
sort order:
Statistics: Num rows: 41 Data size: 3854 Basic stats: COMPLETE Column stats: NONE
tag: 1
value expressions: caldate (type: string)
auto parallelism: false
Path -> Alias:
hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/date_range [d]
Path -> Partition:
hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/date_range
Partition
base file name: date_range
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns caldate
columns.comments
columns.types string
file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
location hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/date_range
name default.date_range
numFiles 1
numRows 41
rawDataSize 3854
serialization.ddl struct date_range { string caldate}
serialization.format 1
serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
totalSize 350
transient_lastDdlTime 1467166854
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns caldate
columns.comments
columns.types string
file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
location hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/date_range
name default.date_range
numFiles 1
numRows 41
rawDataSize 3854
serialization.ddl struct date_range { string caldate}
serialization.format 1
serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
totalSize 350
transient_lastDdlTime 1467166854
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: default.date_range
name: default.date_range
Truncated Path -> Alias:
/date_range [d]
Execution mode: vectorized
Map 2
Map Operator Tree:
TableScan
alias: rf
Statistics: Num rows: 87435021 Data size: 343639889520 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Map Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {account_number} {aggregation_type} {application_name} {company_code} {cost_center} {dimension01} {dimension02} {dimension03} {dimension04} {dimension05} {dimension06} {dimension07} {dimension08} {dimension09} {dimension10} {dimension11} {dimension12} {dimension13} {dimension14} {dimension15} {financial_event_type} {functional_currency_code} {func_currency_amt_sum} {func_currency_balance} {func_currency_beg_balance} {func_journal_balance} {func_journal_beg_balance} {func_journal_sum} {gl_group_id} {gl_product_line} {jl_description} {ledger_id} {local_currency_amt_sum} {local_currency_balance} {local_currency_beg_balance} {local_currency_code} {local_journal_balance} {local_journal_beg_balance} {local_journal_sum} {location} {ltd_amortization_amount} {post_to_gl} {principal_amount} {project} {quantity_sum} {rfd_id} {sales_channel} {sl_db_name} {source_system} {timezone_id}
1 {caldate}
Estimated key counts: Map 1 => 41
keys:
0
1
outputColumnNames: _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27, _col28, _col29, _col30, _col31, _col32, _col33, _col34, _col35, _col36, _col37, _col38, _col39, _col40, _col41, _col42, _col43, _col44, _col45, _col46, _col47, _col48, _col49, _col50, _col54
input vertices:
1 Map 1
Position of Big Table: 0
Statistics: Num rows: 113665523 Data size: 446731839989 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col54 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: string), _col7 (type: string), _col8 (type: string), _col9 (type: string), _col10 (type: string), _col11 (type: string), _col12 (type: string), _col13 (type: string), _col14 (type: string), _col15 (type: string), _col16 (type: string), _col17 (type: string), _col18 (type: string), _col19 (type: string), _col20 (type: string), _col21 (type: string), _col22 (type: string), _col23 (type: decimal(38,18)), _col24 (type: decimal(38,18)), _col25 (type: decimal(38,18)), _col26 (type: decimal(38,18)), _col27 (type: decimal(38,18)), _col28 (type: decimal(38,18)), _col29 (type: string), _col30 (type: string), _col31 (type: string), _col32 (type: string), _col33 (type: decimal(38,18)), _col34 (type: decimal(38,18)), _col35 (type: decimal(38,18)), _col36 (type: string), _col37 (type: decimal(38,18)), _col38 (type: decimal(38,18)), _col39 (type: decimal(38,18)), _col40 (type: string), _col41 (type: decimal(38,18)), _col42 (type: string), _col43 (type: decimal(38,18)), _col44 (type: string), _col45 (type: decimal(38,18)), _col46 (type: string), _col47 (type: string), _col48 (type: string), _col49 (type: string), _col50 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27, _col28, _col29, _col30, _col31, _col32, _col33, _col34, _col35, _col36, _col37, _col38, _col39, _col40, _col41, _col42, _col43, _col44, _col45, _col46, _col47, _col48, _col49, _col50
Statistics: Num rows: 113665523 Data size: 446731839989 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
GlobalTableId: 0
directory: hdfs://ip-172-31-1-35.ec2.internal:8020/tmp/hive/hadoop/daa51bd7-fade-41f1-b96d-e641f35e1bfc/hive_2016-06-30_18-42-00_307_2080077224343537923-1/-ext-10001
NumFilesPerFileSink: 1
Statistics: Num rows: 113665523 Data size: 446731839989 Basic stats: COMPLETE Column stats: NONE
Stats Publishing Key Prefix: hdfs://ip-172-31-1-35.ec2.internal:8020/tmp/hive/hadoop/daa51bd7-fade-41f1-b96d-e641f35e1bfc/hive_2016-06-30_18-42-00_307_2080077224343537923-1/-ext-10001/
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
columns _col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13,_col14,_col15,_col16,_col17,_col18,_col19,_col20,_col21,_col22,_col23,_col24,_col25,_col26,_col27,_col28,_col29,_col30,_col31,_col32,_col33,_col34,_col35,_col36,_col37,_col38,_col39,_col40,_col41,_col42,_col43,_col44,_col45,_col46,_col47,_col48,_col49,_col50
columns.types string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:string:string:string:string
escape.delim \
hive.serialization.extend.nesting.levels true
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
TotalFiles: 1
GatherStats: false
MultiFileSpray: false
Path -> Alias:
hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/oldest_act_dt_previous_day_rf [rf]
Path -> Partition:
hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/oldest_act_dt_previous_day_rf
Partition
base file name: oldest_act_dt_previous_day_rf
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns accounting_date_local,account_number,aggregation_type,application_name,company_code,cost_center,dimension01,dimension02,dimension03,dimension04,dimension05,dimension06,dimension07,dimension08,dimension09,dimension10,dimension11,dimension12,dimension13,dimension14,dimension15,financial_event_type,functional_currency_code,func_currency_amt_sum,func_currency_balance,func_currency_beg_balance,func_journal_balance,func_journal_beg_balance,func_journal_sum,gl_group_id,gl_product_line,jl_description,ledger_id,local_currency_amt_sum,local_currency_balance,local_currency_beg_balance,local_currency_code,local_journal_balance,local_journal_beg_balance,local_journal_sum,location,ltd_amortization_amount,post_to_gl,principal_amount,project,quantity_sum,rfd_id,sales_channel,sl_db_name,source_system,timezone_id
columns.comments
columns.types timestamp:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:string:string:string:string
file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
location hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/oldest_act_dt_previous_day_rf
name default.oldest_act_dt_previous_day_rf
numFiles 324
numRows 87435021
rawDataSize 343639889520
serialization.ddl struct oldest_act_dt_previous_day_rf { timestamp accounting_date_local, string account_number, string aggregation_type, string application_name, string company_code, string cost_center, string dimension01, string dimension02, string dimension03, string dimension04, string dimension05, string dimension06, string dimension07, string dimension08, string dimension09, string dimension10, string dimension11, string dimension12, string dimension13, string dimension14, string dimension15, string financial_event_type, string functional_currency_code, decimal(38,18) func_currency_amt_sum, decimal(38,18) func_currency_balance, decimal(38,18) func_currency_beg_balance, decimal(38,18) func_journal_balance, decimal(38,18) func_journal_beg_balance, decimal(38,18) func_journal_sum, string gl_group_id, string gl_product_line, string jl_description, string ledger_id, decimal(38,18) local_currency_amt_sum, decimal(38,18) local_currency_balance, decimal(38,18) local_currency_beg_balance, string local_currency_code, decimal(38,18) local_journal_balance, decimal(38,18) local_journal_beg_balance, decimal(38,18) local_journal_sum, string location, decimal(38,18) ltd_amortization_amount, string post_to_gl, decimal(38,18) principal_amount, string project, decimal(38,18) quantity_sum, string rfd_id, string sales_channel, string sl_db_name, string source_system, string timezone_id}
serialization.format 1
serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
totalSize 4575811884
transient_lastDdlTime 1467309711
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns accounting_date_local,account_number,aggregation_type,application_name,company_code,cost_center,dimension01,dimension02,dimension03,dimension04,dimension05,dimension06,dimension07,dimension08,dimension09,dimension10,dimension11,dimension12,dimension13,dimension14,dimension15,financial_event_type,functional_currency_code,func_currency_amt_sum,func_currency_balance,func_currency_beg_balance,func_journal_balance,func_journal_beg_balance,func_journal_sum,gl_group_id,gl_product_line,jl_description,ledger_id,local_currency_amt_sum,local_currency_balance,local_currency_beg_balance,local_currency_code,local_journal_balance,local_journal_beg_balance,local_journal_sum,location,ltd_amortization_amount,post_to_gl,principal_amount,project,quantity_sum,rfd_id,sales_channel,sl_db_name,source_system,timezone_id
columns.comments
columns.types timestamp:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):decimal(38,18):string:string:string:string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):decimal(38,18):decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:decimal(38,18):string:string:string:string:string
file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
location hdfs://ip-172-31-1-35.ec2.internal:8020/user/hive/warehouse/oldest_act_dt_previous_day_rf
name default.oldest_act_dt_previous_day_rf
numFiles 324
numRows 87435021
rawDataSize 343639889520
serialization.ddl struct oldest_act_dt_previous_day_rf { timestamp accounting_date_local, string account_number, string aggregation_type, string application_name, string company_code, string cost_center, string dimension01, string dimension02, string dimension03, string dimension04, string dimension05, string dimension06, string dimension07, string dimension08, string dimension09, string dimension10, string dimension11, string dimension12, string dimension13, string dimension14, string dimension15, string financial_event_type, string functional_currency_code, decimal(38,18) func_currency_amt_sum, decimal(38,18) func_currency_balance, decimal(38,18) func_currency_beg_balance, decimal(38,18) func_journal_balance, decimal(38,18) func_journal_beg_balance, decimal(38,18) func_journal_sum, string gl_group_id, string gl_product_line, string jl_description, string ledger_id, decimal(38,18) local_currency_amt_sum, decimal(38,18) local_currency_balance, decimal(38,18) local_currency_beg_balance, string local_currency_code, decimal(38,18) local_journal_balance, decimal(38,18) local_journal_beg_balance, decimal(38,18) local_journal_sum, string location, decimal(38,18) ltd_amortization_amount, string post_to_gl, decimal(38,18) principal_amount, string project, decimal(38,18) quantity_sum, string rfd_id, string sales_channel, string sl_db_name, string source_system, string timezone_id}
serialization.format 1
serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
totalSize 4575811884
transient_lastDdlTime 1467309711
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: default.oldest_act_dt_previous_day_rf
name: default.oldest_act_dt_previous_day_rf
Truncated Path -> Alias:
/oldest_act_dt_previous_day_rf [rf]
Execution mode: vectorized
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink