Posted to dev@hive.apache.org by satyajit vegesna <sa...@gmail.com> on 2016/10/26 00:46:04 UTC
Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.
Hi All,
I am using Hive 2.1.0 and Hadoop 2.7.2, but when I try running queries like a
simple insert,
set mapreduce.job.queuename=default;set
hive.exec.dynamic.partition=true;set
hive.exec.dynamic.partition.mode=nonstrict;set
hive.exec.max.dynamic.partitions.pernode=400;set
hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
mapreduce.reduce.memory.mb=5120;set
mapred.tasktracker.map.tasks.maximum=30;set
mapred.tasktracker.reduce.tasks.maximum=20;set
mapred.reduce.child.java.opts=-Xmx2048m;set
mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set
hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set
mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO
access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as
prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
(pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
then (case when (pra.primary_listing_id is null or
TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
<> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
acl.httpstatuscode, acl.httpverb, acl.requesttime,
acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
AS user_agent, substring(split(pra.url,'/')[0],
0,length(split(pra.url,'/')[0])-3) as city,
substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner join
mls_public_record_association_snapshot_orc pra on acl.listing_url = pra.url
left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =
ml.id left outer join attribute a on a.id = ml.standard_status WHERE
acl.accesstimedate="2016-10-23";
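As context for the error below: the *_flush_length files the namenode mentions are side files that Hive writes only for ACID (transactional) tables, so it is worth confirming which of the joined tables are actually transactional. A hedged way to check, using a table name from the query above (look for transactional=true under Table Parameters):

```sql
-- Inspect the table's properties and DDL; a transactional ORC table
-- should show transactional=true in its Table Parameters.
DESCRIBE FORMATTED mls_public_record_association_snapshot_orc;
SHOW CREATE TABLE mls_public_record_association_snapshot_orc;
```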
I finally end up getting the below error,
2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU
1478.96 sec
2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU
1636.58 sec
2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU
1764.97 sec
2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU
1879.61 sec
2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU
2051.38 sec
2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU
2151.49 sec
2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU
2179.67 sec
2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU
2187.86 sec
2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU
2205.22 sec
2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative CPU
2241.25 sec
MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
Ended Job = job_1477437520637_0009
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type
[org.apache.logging.slf4j.Log4jLoggerFactory]
2016-10-25 17:40:35 Starting to launch local task to process map join; maximum
memory = 514850816
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-14
Logs:
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
Read: 8060555201 HDFS Write: 757253756 SUCCESS
Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
Read: 2407914653 HDFS Write: 805874953 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
msec
I could not find any errors in the logs, but when I check the namenode logs, I
get the following error,
2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 8 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 5 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist:
/user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
I also searched for the flush_length files in the above-mentioned location,
but I only see bucket files and none ending with flush_length.
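For readers hitting the same message: as I understand it (this is my reading of the ORC ACID design, not something confirmed in this thread), the *_flush_length side file next to each open delta bucket is just a sequence of 8-byte big-endian longs, each recording a flushed length of the still-being-written bucket file; the last entry is the currently valid length. A minimal sketch of parsing one, with synthetic data:

```python
import struct

def read_side_file(data: bytes):
    """Parse an ORC ACID side file: a sequence of big-endian 8-byte
    longs, each a flushed length of the companion bucket file."""
    if len(data) % 8 != 0:
        raise ValueError("truncated side file")
    return [struct.unpack(">q", data[i:i + 8])[0]
            for i in range(0, len(data), 8)]

# Synthetic example: three flush points recorded for one bucket file.
side = struct.pack(">3q", 1024, 4096, 16384)
lengths = read_side_file(side)
print(lengths[-1])  # latest valid length -> 16384
```

The side file only exists while the delta is open; once the writer closes the bucket, it is deleted, which would explain seeing bucket files but no flush_length files on disk.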
Any clue or help would be highly appreciated.
Regards,
Satyajit.
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
Hi Eugene,
One more observation: in the namenode logs, when I run select count(*) on
individual tables, I still see the same error as before,
org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist: /user/hive/warehouse/mls_public_record_association_
snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
but I get the counts of the tables and they match the source data well.
So I believe the problem is with joining these tables together.
Are there any specific logs you want me to check?
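Since both failures report return code 2 from MapredLocalTask (the map-join local task in Stage-14 / Stage-9) rather than anything in the flush_length lookups, one diagnostic worth trying (a suggestion, not a fix confirmed in this thread) is forcing the small-table join back to an ordinary reduce-side join:

```sql
-- Disable automatic map-join conversion so no local hash-table task
-- is launched; the attribute-table join then runs as a shuffle join.
set hive.auto.convert.join=false;
```

If the query then succeeds, the problem is in the local task's hash-table build (possibly heap: the log shows maximum memory = 514850816, roughly 512 MB) rather than in the ORC delta files.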
Regards,
Satyajit.
On Wed, Oct 26, 2016 at 4:16 PM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:
> Hi Eugene,
>
> select count(*) from mls_public_record_association_snapshot_orc pra left
> outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left
> outer join attribute a on a.id = ml.standard_status
>
> ran till the end and then threw the below exception.
>
> MapReduce Total cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds
> 760 msec
> Ended Job = job_1477494091659_0024
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.
> 1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/
> share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.
> Log4jLoggerFactory]
> 2016-10-26 16:09:01 Starting to launch local task to process map join; maximum
> memory = 514850816
> Execution failed with exit status: 2
> Obtaining error information
>
> Task failed!
> Task ID:
> Stage-9
>
> Logs:
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.
> exec.mr.MapredLocalTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 300 Reduce: 121 Cumulative CPU: 3654.02 sec HDFS
> Read: 1771032233 HDFS Write: 1917532703 SUCCESS
> Total MapReduce CPU Time Spent: 0 days 1 hours 0 minutes 54 seconds 20 msec
>
> Explain Plan:
>
> STAGE DEPENDENCIES:
> Stage-8 is a root stage , consists of Stage-1
> Stage-1
> Stage-9 depends on stages: Stage-1
> Stage-3 depends on stages: Stage-9
> Stage-0 depends on stages: Stage-3
>
> STAGE PLANS:
> Stage: Stage-8
> Conditional Operator
>
> Stage: Stage-1
> Map Reduce
> Map Operator Tree:
> TableScan
> alias: pra
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: primary_listing_id (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: _col0 (type: string)
> sort order: +
> Map-reduce partition columns: _col0 (type: string)
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> TableScan
> alias: ml
> Statistics: Num rows: 201432950 Data size: 20949026816 Basic
> stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: id (type: string), standard_status (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 201432950 Data size: 20949026816 Basic
> stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: _col0 (type: string)
> sort order: +
> Map-reduce partition columns: _col0 (type: string)
> Statistics: Num rows: 201432950 Data size: 20949026816
> Basic stats: COMPLETE Column stats: NONE
> value expressions: _col1 (type: int)
> Reduce Operator Tree:
> Join Operator
> condition map:
> Left Outer Join0 to 1
> keys:
> 0 _col0 (type: string)
> 1 _col0 (type: string)
> outputColumnNames: _col2
> Statistics: Num rows: 221576249 Data size: 23043929997 Basic
> stats: COMPLETE Column stats: NONE
> File Output Operator
> compressed: false
> table:
> input format: org.apache.hadoop.mapred.
> SequenceFileInputFormat
> output format: org.apache.hadoop.hive.ql.io.
> HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.
> lazybinary.LazyBinarySerDe
>
> Stage: Stage-9 -- it is failing in this same MapReduce local work.
> Map Reduce Local Work
> Alias -> Map Local Tables:
> $hdt$_2:a
> Fetch Operator
> limit: -1
> Alias -> Map Local Operator Tree:
> $hdt$_2:a
> TableScan
> alias: a
> Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
> Select Operator
> expressions: id (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
> HashTable Sink Operator
> keys:
> 0 _col2 (type: int)
> 1 _col0 (type: int)
>
> Stage: Stage-3
> Map Reduce
> Map Operator Tree:
> TableScan
> Map Join Operator
> condition map:
> Left Outer Join0 to 1
> keys:
> 0 _col2 (type: int)
> 1 _col0 (type: int)
> Statistics: Num rows: 243733879 Data size: 25348323546 Basic
> stats: COMPLETE Column stats: NONE
> Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> Reduce Output Operator
> sort order:
> Statistics: Num rows: 1 Data size: 8 Basic stats:
> COMPLETE Column stats: NONE
> value expressions: _col0 (type: bigint)
> Local Work:
> Map Reduce Local Work
> Reduce Operator Tree:
> Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.
> SequenceFileInputFormat
> output format: org.apache.hadoop.hive.ql.io.
> HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> Stage: Stage-0
> Fetch Operator
> limit: -1
> Processor Tree:
> ListSink
>
> Any suggestion in debugging this issue is appreciated.
>
>
> Regards,
> Satyajit.
>
>
>
>
> On Wed, Oct 26, 2016 at 3:34 PM, Eugene Koifman <ek...@hortonworks.com>
> wrote:
>
>> If you can run this, then it’s safe to ignore “00094_flush_length” messages
>> and the issue is somewhere else
>>
>> select count(*) from mls_public_record_association_snapshot_orc pra
>> left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =
>> ml.id left outer join attribute a on a.id = ml.standard_status
>>
>> Eugene
>>
>> From: satyajit vegesna <sa...@gmail.com>
>> Date: Wednesday, October 26, 2016 at 2:14 PM
>> To: "user@hive.apache.org" <us...@hive.apache.org>, Eugene Koifman <
>> ekoifman@hortonworks.com>
>> Cc: "dev@hive.apache.org" <de...@hive.apache.org>
>> Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
>> execution engine.
>>
>> Hi Eugene,
>>
>> PFB Transaction table in green and parquet tables in yellow,
>>
>> INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23")
>> select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS
>> timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
>> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
>> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
>> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
>> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
>> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
>> then (case when (pra.primary_listing_id is null or
>> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
>> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
>> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
>> acl.httpstatuscode, acl.httpverb, acl.requesttime,
>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>> AS user_agent, substring(split(pra.url,'/')[0],
>> 0,length(split(pra.url,'/')[0])-3) as city,
>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>> join mls_public_record_association_snapshot_orc pra on acl.listing_url =
>> pra.url left outer join mls_listing_snapshot_orc ml on
>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>
>>
>> Any clue, or something that you would want me to focus on to debug the
>> issue.
>>
>> Regards,
>> Satyajit.
>>
>>
>>
>> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ekoifman@hortonworks.com
>> > wrote:
>>
>>> Which of your tables are are transactional? Can you provide the DDL?
>>>
>>> I don’t think “File does not exist” error is causing your queries to
>>> fail. It’s an INFO level msg.
>>> There should be some other error.
>>>
>>> Eugene
>>>
>>>
>>> From: satyajit vegesna <sa...@gmail.com>
>>> Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
>>> Date: Tuesday, October 25, 2016 at 5:46 PM
>>> To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org"
>>> <de...@hive.apache.org>
>>> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
>>> execution engine.
>>>
>>> HI All,
>>>
>>> i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries
>>> like simple insert,
>>>
>>> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>>> hive.exec.dynamic.partition.mode=nonstrict;set
>>> hive.exec.max.dynamic.partitions.pernode=400;set
>>> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
>>> mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set
>>> mapred.tasktracker.reduce.tasks.maximum=20;set
>>> mapred.reduce.child.java.opts=-Xmx2048m;set
>>> mapred.map.child.java.opts=-Xmx2048m; set
>>> hive.support.concurrency=true; set hive.txn.manager=org.apache.ha
>>> doop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=false;
>>> set hive.compactor.worker.threads=1;set mapreduce.job.queuename=default;set
>>> hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;INSERT
>>> INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url
>>> as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
>>> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as
>>> prId, acl.accesstime as crawledon, pra.id as propId,
>>> pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS
>>> timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(pra.created_at AS timestamp)) as prcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS
>>> timestamp)) as prupdateage, (case when (pra.public_record_id is not null
>>> and TRIM(pra.public_record_id) <> '') then (case when
>>> (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) = '') then
>>> 'PR' else 'PRMLS' END) else (case when (pra.primary_listing_id is not null
>>> and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) END)
>>> as listingType, acl.httpstatuscode, acl.httpverb, acl.requesttime,
>>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>>> AS user_agent, substring(split(pra.url,'/')[0],
>>> 0,length(split(pra.url,'/')[0])-3) as city,
>>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>>> join mls_public_record_association_snapshot_orc pra on acl.listing_url
>>> = pra.url left outer join mls_listing_snapshot_orc ml on
>>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>>
>>> i finally end up getting below error,
>>>
>>> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative
>>> CPU 1478.96 sec
>>> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative
>>> CPU 1636.58 sec
>>> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative
>>> CPU 1764.97 sec
>>> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative
>>> CPU 1879.61 sec
>>> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative
>>> CPU 2051.38 sec
>>> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative
>>> CPU 2151.49 sec
>>> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative
>>> CPU 2179.67 sec
>>> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative
>>> CPU 2187.86 sec
>>> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative
>>> CPU 2205.22 sec
>>> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative
>>> CPU 2241.25 sec
>>> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
>>> Ended Job = job_1477437520637_0009
>>> SLF4J: Class path contains multiple SLF4J bindings.
>>> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1
>>> .0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/Static
>>> LoggerBinder.class]
>>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/sh
>>> are/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/im
>>> pl/StaticLoggerBinder.class]
>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>> explanation.
>>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4
>>> jLoggerFactory]
>>> 2016-10-25 17:40:35 Starting to launch local task to process map join; maximum
>>> memory = 514850816
>>> Execution failed with exit status: 2
>>> Obtaining error information
>>>
>>> Task failed!
>>> Task ID:
>>> Stage-14
>>>
>>> Logs:
>>>
>>> FAILED: Execution Error, return code 2 from
>>> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>>> MapReduce Jobs Launched:
>>> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
>>> Read: 8060555201 HDFS Write: 757253756 SUCCESS
>>> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
>>> Read: 2407914653 HDFS Write: 805874953 SUCCESS
>>> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
>>> msec
>>>
>>> Could not find any errors in logs, but when i check namenode logs , oi
>>> get the following error,
>>>
>>> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00094_flush_length
>>> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00095_flush_length
>>> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00096_flush_length
>>> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00097_flush_length
>>> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00098_flush_length
>>> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00099_flush_length
>>> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 8 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00012_flush_length
>>> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 5 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>>>
>>> i also search for find the flush_length files in the above mentioned
>>> location, but i only see buckets but no files ending with flush_length.
>>>
>>> Any clue or help would be highly appreciated.
>>>
>>> Regards,
>>> Satyajit.
>>>
>>>
>>
>
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
Hi Eugene,
One more observation, in namenode logs when i run select count(*) on
individual tables i still see the same error as before,
org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist: /user/hive/warehouse/mls_public_record_association_
snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
but i get the count of the tables and they match well with the source data.
So i believe the problem is with joining these tables together.
Any specific logs you want me to debug.
Regards,
Satyajit.
On Wed, Oct 26, 2016 at 4:16 PM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:
> Hi Eugene,
>
> select count(*) from mls_public_record_association_snapshot_orc pra left
> outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left
> outer join attribute a on a.id = ml.standard_status
>
> ran till end and threw the below exception.
>
> MapReduce Total cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds
> 760 msec
> Ended Job = job_1477494091659_0024
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.
> 1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/
> share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.
> Log4jLoggerFactory]
> 2016-10-26 16:09:01 Starting to launch local task to process map join; maximum
> memory = 514850816
> Execution failed with exit status: 2
> Obtaining error information
>
> Task failed!
> Task ID:
> Stage-9
>
> Logs:
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.
> exec.mr.MapredLocalTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 300 Reduce: 121 Cumulative CPU: 3654.02 sec HDFS
> Read: 1771032233 HDFS Write: 1917532703 SUCCESS
> Total MapReduce CPU Time Spent: 0 days 1 hours 0 minutes 54 seconds 20 msec
>
> Explain Plan:
>
> STAGE DEPENDENCIES:
> Stage-8 is a root stage , consists of Stage-1
> Stage-1
> Stage-9 depends on stages: Stage-1
> Stage-3 depends on stages: Stage-9
> Stage-0 depends on stages: Stage-3
>
> STAGE PLANS:
> Stage: Stage-8
> Conditional Operator
>
> Stage: Stage-1
> Map Reduce
> Map Operator Tree:
> TableScan
> alias: pra
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: primary_listing_id (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: _col0 (type: string)
> sort order: +
> Map-reduce partition columns: _col0 (type: string)
> Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
> TableScan
> alias: ml
> Statistics: Num rows: 201432950 Data size: 20949026816 Basic
> stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: id (type: string), standard_status (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 201432950 Data size: 20949026816 Basic
> stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: _col0 (type: string)
> sort order: +
> Map-reduce partition columns: _col0 (type: string)
> Statistics: Num rows: 201432950 Data size: 20949026816
> Basic stats: COMPLETE Column stats: NONE
> value expressions: _col1 (type: int)
> Reduce Operator Tree:
> Join Operator
> condition map:
> Left Outer Join0 to 1
> keys:
> 0 _col0 (type: string)
> 1 _col0 (type: string)
> outputColumnNames: _col2
> Statistics: Num rows: 221576249 Data size: 23043929997 Basic
> stats: COMPLETE Column stats: NONE
> File Output Operator
> compressed: false
> table:
> input format: org.apache.hadoop.mapred.
> SequenceFileInputFormat
> output format: org.apache.hadoop.hive.ql.io.
> HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.
> lazybinary.LazyBinarySerDe
>
> Stage: Stage-9 --it is failing in the same mapreduce local work.
> Map Reduce Local Work
> Alias -> Map Local Tables:
> $hdt$_2:a
> Fetch Operator
> limit: -1
> Alias -> Map Local Operator Tree:
> $hdt$_2:a
> TableScan
> alias: a
> Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
> Select Operator
> expressions: id (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
> HashTable Sink Operator
> keys:
> 0 _col2 (type: int)
> 1 _col0 (type: int)
>
> Stage: Stage-3
> Map Reduce
> Map Operator Tree:
> TableScan
> Map Join Operator
> condition map:
> Left Outer Join0 to 1
> keys:
> 0 _col2 (type: int)
> 1 _col0 (type: int)
> Statistics: Num rows: 243733879 Data size: 25348323546 Basic
> stats: COMPLETE Column stats: NONE
> Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> Reduce Output Operator
> sort order:
> Statistics: Num rows: 1 Data size: 8 Basic stats:
> COMPLETE Column stats: NONE
> value expressions: _col0 (type: bigint)
> Local Work:
> Map Reduce Local Work
> Reduce Operator Tree:
> Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> Stage: Stage-0
> Fetch Operator
> limit: -1
> Processor Tree:
> ListSink
>
> Any suggestions for debugging this issue would be appreciated.
>
>
> Regards,
> Satyajit.
>
>
>
>
> On Wed, Oct 26, 2016 at 3:34 PM, Eugene Koifman <ek...@hortonworks.com>
> wrote:
>
>> If you can run this, then it’s safe to ignore the “00094_flush_length”
>> messages, and the issue is somewhere else.
>>
>> select count(*) from mls_public_record_association_snapshot_orc pra
>> left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =
>> ml.id left outer join attribute a on a.id = ml.standard_status
>>
>> Eugene
>>
>> From: satyajit vegesna <sa...@gmail.com>
>> Date: Wednesday, October 26, 2016 at 2:14 PM
>> To: "user@hive.apache.org" <us...@hive.apache.org>, Eugene Koifman <
>> ekoifman@hortonworks.com>
>> Cc: "dev@hive.apache.org" <de...@hive.apache.org>
>> Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
>> execution engine.
>>
>> Hi Eugene,
>>
>> PFB the query, with the transactional table in green and the Parquet tables in yellow,
>>
>> INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23")
>> select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS
>> timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
>> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
>> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
>> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
>> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
>> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
>> then (case when (pra.primary_listing_id is null or
>> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
>> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
>> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
>> acl.httpstatuscode, acl.httpverb, acl.requesttime,
>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>> AS user_agent, substring(split(pra.url,'/')[0],
>> 0,length(split(pra.url,'/')[0])-3) as city,
>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>> join mls_public_record_association_snapshot_orc pra on acl.listing_url =
>> pra.url left outer join mls_listing_snapshot_orc ml on
>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>
>>
>> Any clue, or anything you would want me to focus on to debug the
>> issue, would be appreciated.
>>
>> Regards,
>> Satyajit.
>>
>>
>>
>> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ekoifman@hortonworks.com
>> > wrote:
>>
>>> Which of your tables are transactional? Can you provide the DDL?
>>>
>>> I don’t think the “File does not exist” error is causing your queries to
>>> fail. It’s an INFO-level message.
>>> There should be some other error.
>>>
>>> Eugene
>>>
>>>
>>> From: satyajit vegesna <sa...@gmail.com>
>>> Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
>>> Date: Tuesday, October 25, 2016 at 5:46 PM
>>> To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org"
>>> <de...@hive.apache.org>
>>> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
>>> execution engine.
>>>
>>> Hi All,
>>>
>>> I am using Hive 2.1.0 and Hadoop 2.7.2, but when I try running queries
>>> like a simple insert,
>>>
>>> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>>> hive.exec.dynamic.partition.mode=nonstrict;set
>>> hive.exec.max.dynamic.partitions.pernode=400;set
>>> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
>>> mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set
>>> mapred.tasktracker.reduce.tasks.maximum=20;set
>>> mapred.reduce.child.java.opts=-Xmx2048m;set
>>> mapred.map.child.java.opts=-Xmx2048m; set
>>> hive.support.concurrency=true; set hive.txn.manager=org.apache.ha
>>> doop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=false;
>>> set hive.compactor.worker.threads=1;set mapreduce.job.queuename=default;set
>>> hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;INSERT
>>> INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url
>>> as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
>>> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as
>>> prId, acl.accesstime as crawledon, pra.id as propId,
>>> pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS
>>> timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(pra.created_at AS timestamp)) as prcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS
>>> timestamp)) as prupdateage, (case when (pra.public_record_id is not null
>>> and TRIM(pra.public_record_id) <> '') then (case when
>>> (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) = '') then
>>> 'PR' else 'PRMLS' END) else (case when (pra.primary_listing_id is not null
>>> and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) END)
>>> as listingType, acl.httpstatuscode, acl.httpverb, acl.requesttime,
>>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>>> AS user_agent, substring(split(pra.url,'/')[0],
>>> 0,length(split(pra.url,'/')[0])-3) as city,
>>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>>> join mls_public_record_association_snapshot_orc pra on acl.listing_url
>>> = pra.url left outer join mls_listing_snapshot_orc ml on
>>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>>
>>> i finally end up getting below error,
>>>
>>> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative
>>> CPU 1478.96 sec
>>> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative
>>> CPU 1636.58 sec
>>> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative
>>> CPU 1764.97 sec
>>> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative
>>> CPU 1879.61 sec
>>> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative
>>> CPU 2051.38 sec
>>> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative
>>> CPU 2151.49 sec
>>> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative
>>> CPU 2179.67 sec
>>> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative
>>> CPU 2187.86 sec
>>> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative
>>> CPU 2205.22 sec
>>> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative
>>> CPU 2241.25 sec
>>> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
>>> Ended Job = job_1477437520637_0009
>>> SLF4J: Class path contains multiple SLF4J bindings.
>>> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>> explanation.
>>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
>>> 2016-10-25 17:40:35 Starting to launch local task to process map join; maximum
>>> memory = 514850816
>>> Execution failed with exit status: 2
>>> Obtaining error information
>>>
>>> Task failed!
>>> Task ID:
>>> Stage-14
>>>
>>> Logs:
>>>
>>> FAILED: Execution Error, return code 2 from
>>> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>>> MapReduce Jobs Launched:
>>> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
>>> Read: 8060555201 HDFS Write: 757253756 SUCCESS
>>> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
>>> Read: 2407914653 HDFS Write: 805874953 SUCCESS
>>> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
>>> msec
>>>
>>> Could not find any errors in the logs, but when I check the namenode logs, I
>>> get the following error,
>>>
>>> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
>>> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
>>> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
>>> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
>>> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
>>> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
>>> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
>>> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>>>
>>> I also searched for the flush_length files in the above-mentioned
>>> location, but I only see bucket files and no files ending with flush_length.
>>>
>>> Any clue or help would be highly appreciated.
>>>
>>> Regards,
>>> Satyajit.
>>>
>>>
>>
>
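For readers hitting the same symptoms, the checks below follow from the discussion in this thread. This is a sketch, not the thread's confirmed resolution: the table and delta-directory names are taken from the logs above, and the `hive.*` properties are standard Hive 2.1 settings.

```sql
-- 1. Answer Eugene's DDL question: a transactional table carries
--    transactional=true in its Table Parameters.
DESCRIBE FORMATTED mls_public_record_association_snapshot_orc;

-- 2. Inspect the delta directory named in the namenode log (dfs runs
--    from the Hive CLI). The *_flush_length side files exist only while
--    a delta is still open, e.g. during streaming ingest; requests for
--    them against a closed delta are the INFO messages seen above.
dfs -ls /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/;

-- 3. The query actually dies in the local map-join task (return code 2
--    from MapredLocalTask, Stage-9). To isolate or work around that,
--    disable automatic map-join conversion so the join runs as an
--    ordinary shuffle join instead of a local hash-table build:
set hive.auto.convert.join=false;
```

If the query succeeds with map-join conversion disabled, the failure is in the local hash-table build (often memory), not in the ORC ACID files.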
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
Hi Eugene,
select count(*) from mls_public_record_association_snapshot_orc pra left
outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left
outer join attribute a on a.id = ml.standard_status
ran till end and threw the below exception.
MapReduce Total cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds
760 msec
Ended Job = job_1477494091659_0024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type
[org.apache.logging.slf4j.Log4jLoggerFactory]
2016-10-26 16:09:01 Starting to launch local task to process map join; maximum
memory = 514850816
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-9
Logs:
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 300 Reduce: 121 Cumulative CPU: 3654.02 sec HDFS
Read: 1771032233 HDFS Write: 1917532703 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 0 minutes 54 seconds 20 msec
Explain Plan:
STAGE DEPENDENCIES:
Stage-8 is a root stage , consists of Stage-1
Stage-1
Stage-9 depends on stages: Stage-1
Stage-3 depends on stages: Stage-9
Stage-0 depends on stages: Stage-3
STAGE PLANS:
Stage: Stage-8
Conditional Operator
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: pra
Statistics: Num rows: 99241216 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: primary_listing_id (type: string)
outputColumnNames: _col0
Statistics: Num rows: 99241216 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 99241216 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
TableScan
alias: ml
Statistics: Num rows: 201432950 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), standard_status (type: int)
outputColumnNames: _col0, _col1
Statistics: Num rows: 201432950 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 201432950 Data size: 20949026816
Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int)
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col0 (type: string)
1 _col0 (type: string)
outputColumnNames: _col2
Statistics: Num rows: 221576249 Data size: 23043929997 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-9  -- this is the stage that fails, again in Map Reduce Local Work
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_2:a
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_2:a
TableScan
alias: a
Statistics: Num rows: 12830 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
Select Operator
expressions: id (type: int)
outputColumnNames: _col0
Statistics: Num rows: 12830 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col2 (type: int)
1 _col0 (type: int)
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col2 (type: int)
1 _col0 (type: int)
Statistics: Num rows: 243733879 Data size: 25348323546 Basic
stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
Column stats: NONE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats:
COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
Local Work:
Map Reduce Local Work
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column
stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
Column stats: NONE
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
Any suggestions for debugging this issue are appreciated.
Regards,
Satyajit.
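Since both failures are in the local map-join task (MapredLocalTask, exit status 2, with only ~491 MB of heap reported as "maximum memory = 514850816"), one avenue worth trying is giving the local hash-table build more memory, or stopping Hive from converting the join to a map join at all. A sketch of the relevant settings (values are illustrative guesses, not tuned recommendations):

```sql
-- Option 1: keep the map join but give the local hash-table build more room
SET hive.mapjoin.localtask.max.memory.usage=0.99;
SET hive.mapred.local.mem=2048;  -- child-JVM memory (MB) for local tasks

-- Option 2: fall back to a plain shuffle join (slower, but no local task)
SET hive.auto.convert.join=false;
```

With option 2 the conditional Stage-14/Stage-9 local-work stages should disappear from the plan entirely, which also confirms whether the local task is the real culprit.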
On Wed, Oct 26, 2016 at 3:34 PM, Eugene Koifman <ek...@hortonworks.com>
wrote:
> If you can run this, then it’s safe to ignore “00094_flush_length” messages
> and the issue is somewhere else
>
> select count(*) from mls_public_record_association_snapshot_orc pra left
> outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left
> outer join attribute a on a.id = ml.standard_status
>
> Eugene
>
> From: satyajit vegesna <sa...@gmail.com>
> Date: Wednesday, October 26, 2016 at 2:14 PM
> To: "user@hive.apache.org" <us...@hive.apache.org>, Eugene Koifman <
> ekoifman@hortonworks.com>
> Cc: "dev@hive.apache.org" <de...@hive.apache.org>
> Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
> execution engine.
>
> Hi Eugene,
>
> PFB Transaction table in green and parquet tables in yellow,
>
> INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select
> pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp)
> as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
> then (case when (pra.primary_listing_id is null or
> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
> acl.httpstatuscode, acl.httpverb, acl.requesttime,
> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
> AS user_agent, substring(split(pra.url,'/')[0],
> 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0],
> length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM
> access_logs.loadbalancer_accesslogs acl inner join
> mls_public_record_association_snapshot_orc pra on acl.listing_url =
> pra.url left outer join mls_listing_snapshot_orc ml on
> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>
>
> Any clue, or something that you would want me to focus on to debug the
> issue.
>
> Regards,
> Satyajit.
>
>
>
> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ek...@hortonworks.com>
> wrote:
>
>> Which of your tables are transactional? Can you provide the DDL?
>>
>> I don’t think “File does not exist” error is causing your queries to
>> fail. It’s an INFO level msg.
>> There should be some other error.
>>
>> Eugene
>>
>>
>> From: satyajit vegesna <sa...@gmail.com>
>> Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
>> Date: Tuesday, October 25, 2016 at 5:46 PM
>> To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org"
>> <de...@hive.apache.org>
>> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
>> execution engine.
>>
>> HI All,
>>
>> i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries
>> like simple insert,
>>
>> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>> hive.exec.dynamic.partition.mode=nonstrict;set
>> hive.exec.max.dynamic.partitions.pernode=400;set
>> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
>> mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set
>> mapred.tasktracker.reduce.tasks.maximum=20;set
>> mapred.reduce.child.java.opts=-Xmx2048m;set
>> mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true;
>> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set
>> hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set
>> mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>> hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO
>> access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as
>> prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
>> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
>> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
>> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
>> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
>> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
>> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
>> then (case when (pra.primary_listing_id is null or
>> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
>> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
>> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
>> acl.httpstatuscode, acl.httpverb, acl.requesttime,
>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>> AS user_agent, substring(split(pra.url,'/')[0],
>> 0,length(split(pra.url,'/')[0])-3) as city,
>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>> join mls_public_record_association_snapshot_orc pra on acl.listing_url =
>> pra.url left outer join mls_listing_snapshot_orc ml on
>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>
>> i finally end up getting below error,
>>
>> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU
>> 1478.96 sec
>> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU
>> 1636.58 sec
>> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU
>> 1764.97 sec
>> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU
>> 1879.61 sec
>> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU
>> 2051.38 sec
>> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU
>> 2151.49 sec
>> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU
>> 2179.67 sec
>> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU
>> 2187.86 sec
>> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU
>> 2205.22 sec
>> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative
>> CPU 2241.25 sec
>> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
>> Ended Job = job_1477437520637_0009
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
>> 2016-10-25 17:40:35 Starting to launch local task to process map join; maximum memory = 514850816
>> Execution failed with exit status: 2
>> Obtaining error information
>>
>> Task failed!
>> Task ID:
>> Stage-14
>>
>> Logs:
>>
>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>> MapReduce Jobs Launched:
>> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
>> Read: 8060555201 HDFS Write: 757253756 SUCCESS
>> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
>> Read: 2407914653 HDFS Write: 805874953 SUCCESS
>> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
>> msec
>>
>> Could not find any errors in the Hive logs, but when I check the namenode logs, I
>> get the following error,
>>
>> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
>> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
>> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
>> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
>> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
>> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
>> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
>> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>>
>> i also searched for the flush_length files in the above-mentioned
>> location, but i only see the bucket files and no files ending with flush_length.
>>
>> Any clue or help would be highly appreciated.
>>
>> Regards,
>> Satyajit.
>>
>>
>
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
PFB the EXPLAIN plan from Hive; the query fails at Stage-14. I have also
run another query, which likewise fails in its *Map Reduce Local
Work* stage.
STAGE DEPENDENCIES:
Stage-13 is a root stage , consists of Stage-1
Stage-1
Stage-10 depends on stages: Stage-1 , consists of Stage-15, Stage-2
Stage-15 has a backup stage: Stage-2
Stage-9 depends on stages: Stage-15
Stage-14 depends on stages: Stage-2, Stage-9
Stage-8 depends on stages: Stage-14
Stage-0 depends on stages: Stage-8
Stage-4 depends on stages: Stage-0
Stage-2
STAGE PLANS:
Stage: Stage-13
Conditional Operator
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: acl
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Filter Operator
predicate: listing_url is not null (type: boolean)
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: httpstatuscode (type: string), accesstime
(type: timestamp), useragent (type: string), page_id (type: string),
listing_url (type: string), requesttime (type: string), upstreamheadertime
(type: string), upstreamresponsetime (type: string), httpverb (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col4 (type: string)
sort order: +
Map-reduce partition columns: _col4 (type: string)
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
timestamp), _col2 (type: string), _col3 (type: string), _col5 (type:
string), _col6 (type: string), _col7 (type: string), _col8 (type: string)
TableScan
alias: pra
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Filter Operator
predicate: url is not null (type: boolean)
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), primary_listing_id (type:
string), public_record_id (type: string), url_type (type: string), url
(type: string), created_at (type: bigint), updated_at (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col4 (type: string)
sort order: +
Map-reduce partition columns: _col4 (type: string)
Statistics: Num rows: 19232793 Data size: 9924121600
Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
string), _col2 (type: string), _col3 (type: string), _col5 (type: bigint),
_col6 (type: bigint)
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
0 _col4 (type: string)
1 _col4 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16
Statistics: Num rows: 21156072 Data size: 10916533996 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-10
Conditional Operator
Stage: Stage-15
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_2:ml
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_2:ml
TableScan
alias: ml
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), mls_id (type: int),
standard_status (type: int), updated_at (type: bigint), created_at (type:
bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col11 (type: string)
1 _col0 (type: string)
Stage: Stage-9
Map Reduce
Map Operator Tree:
TableScan
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col11 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col19, _col20, _col21
Statistics: Num rows: 185838144 Data size: 23043929997 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Local Work:
Map Reduce Local Work
Stage: Stage-14
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_3:a
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_3:a
TableScan
alias: a
Statistics: Num rows: 493 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
Select Operator
expressions: id (type: int), name (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 493 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col19 (type: int)
1 _col0 (type: int)
Stage: Stage-8
Map Reduce
Map Operator Tree:
TableScan
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col19 (type: int)
1 _col0 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col20, _col21, _col23
Statistics: Num rows: 204421962 Data size: 25348323546 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col14 (type: string), _col13 (type: string),
CAST( _col15 AS TIMESTAMP) (type: timestamp), CAST( _col16 AS TIMESTAMP)
(type: timestamp), CAST( _col21 AS TIMESTAMP) (type: timestamp), CAST(
_col20 AS TIMESTAMP) (type: timestamp), _col23 (type: string), _col12
(type: string), _col1 (type: timestamp), _col10 (type: string), _col11
(type: string), datediff(_col1, CAST( _col21 AS TIMESTAMP)) (type: int),
datediff(_col1, CAST( _col20 AS TIMESTAMP)) (type: int), datediff(_col1,
CAST( _col15 AS TIMESTAMP)) (type: int), datediff(_col1, CAST( _col16 AS
TIMESTAMP)) (type: int), CASE WHEN ((_col12 is not null and (trim(_col12)
<> ''))) THEN (CASE WHEN ((_col11 is null or (trim(_col11) = ''))) THEN
('PR') ELSE ('PRMLS') END) ELSE (CASE WHEN ((_col11 is not null and
(trim(_col11) <> ''))) THEN ('MLS') ELSE ('UNKNOWN') END) END (type:
string), _col0 (type: string), _col8 (type: string), _col5 (type: string),
_col6 (type: string), _col7 (type: string), _col3 (type: string), _col2
(type: string), substring(split(_col14, '/')[0], 0, (length(split(_col14,
'/')[0]) - 3)) (type: string), substring(split(_col14, '/')[0],
(length(split(_col14, '/')[0]) - 1), 2) (type: string), _col18 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14,
_col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23,
_col24, _col25
Statistics: Num rows: 204421962 Data size: 25348323546
Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 204421962 Data size: 25348323546
Basic stats: COMPLETE Column stats: NONE
table:
input format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
output format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
serde:
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
name: access_logs.crawlstats_dpp
Local Work:
Map Reduce Local Work
Stage: Stage-0
Move Operator
tables:
partition:
day 2016-10-23
replace: false
table:
input format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
output format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
serde:
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
name: access_logs.crawlstats_dpp
Stage: Stage-4
Stats-Aggr Operator
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
Reduce Output Operator
key expressions: _col11 (type: string)
sort order: +
Map-reduce partition columns: _col11 (type: string)
Statistics: Num rows: 21156072 Data size: 10916533996 Basic
stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
timestamp), _col2 (type: string), _col3 (type: string), _col5 (type:
string), _col6 (type: string), _col7 (type: string), _col8 (type: string),
_col10 (type: string), _col12 (type: string), _col13 (type: string), _col14
(type: string), _col15 (type: bigint), _col16 (type: bigint)
TableScan
alias: ml
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), mls_id (type: int),
standard_status (type: int), updated_at (type: bigint), created_at (type:
bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 168943764 Data size: 20949026816
Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int),
_col3 (type: bigint), _col4 (type: bigint)
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col11 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col19, _col20, _col21
Statistics: Num rows: 185838144 Data size: 23043929997 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Regards,
Satyajit.
On Wed, Oct 26, 2016 at 2:14 PM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:
> Hi Eugene,
>
> PFB Transaction table in green and parquet tables in yellow,
>
> INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select
> pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp)
> as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
> then (case when (pra.primary_listing_id is null or
> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
> acl.httpstatuscode, acl.httpverb, acl.requesttime,
> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
> AS user_agent, substring(split(pra.url,'/')[0],
> 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0],
> length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM
> access_logs.loadbalancer_accesslogs acl inner join
> mls_public_record_association_snapshot_orc pra on acl.listing_url =
> pra.url left outer join mls_listing_snapshot_orc ml on
> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>
>
> Any clue, or something that you would want me to focus on to debug the
> issue.
>
> Regards,
> Satyajit.
>
>
>
> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ek...@hortonworks.com>
> wrote:
>
>> Which of your tables are transactional? Can you provide the DDL?
>>
>> I don’t think “File does not exist” error is causing your queries to
>> fail. It’s an INFO level msg.
>> There should be some other error.
>>
>> Eugene
>>
>>
>> From: satyajit vegesna <sa...@gmail.com>
>> Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
>> Date: Tuesday, October 25, 2016 at 5:46 PM
>> To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org"
>> <de...@hive.apache.org>
>> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
>> execution engine.
>>
>> HI All,
>>
>> i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries
>> like simple insert,
>>
>> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>> hive.exec.dynamic.partition.mode=nonstrict;set
>> hive.exec.max.dynamic.partitions.pernode=400;set
>> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
>> mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set
>> mapred.tasktracker.reduce.tasks.maximum=20;set
>> mapred.reduce.child.java.opts=-Xmx2048m;set
>> mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true;
>> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set
>> hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set
>> mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
>> hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO
>> access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as
>> prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
>> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
>> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
>> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
>> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
>> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
>> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
>> then (case when (pra.primary_listing_id is null or
>> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
>> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
>> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
>> acl.httpstatuscode, acl.httpverb, acl.requesttime,
>> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
>> AS user_agent, substring(split(pra.url,'/')[0],
>> 0,length(split(pra.url,'/')[0])-3) as city,
>> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner
>> join mls_public_record_association_snapshot_orc pra on acl.listing_url =
>> pra.url left outer join mls_listing_snapshot_orc ml on
>> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
>> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>>
>> i finally end up getting below error,
>>
>> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU
>> 1478.96 sec
>> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU
>> 1636.58 sec
>> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU
>> 1764.97 sec
>> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU
>> 1879.61 sec
>> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU
>> 2051.38 sec
>> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU
>> 2151.49 sec
>> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU
>> 2179.67 sec
>> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU
>> 2187.86 sec
>> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU
>> 2205.22 sec
>> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative
>> CPU 2241.25 sec
>> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
>> Ended Job = job_1477437520637_0009
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1
>> .0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/Static
>> LoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/sh
>> are/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/im
>> pl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4
>> jLoggerFactory]
>> 2016-10-25 17:40:35Starting to launch local task to process map join;maximum
>> memory = 514850816
>> Execution failed with exit status: 2
>> Obtaining error information
>>
>> Task failed!
>> Task ID:
>> Stage-14
>>
>> Logs:
>>
>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec
>> .mr.MapredLocalTask
>> MapReduce Jobs Launched:
>> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
>> Read: 8060555201 HDFS Write: 757253756 SUCCESS
>> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
>> Read: 2407914653 HDFS Write: 805874953 SUCCESS
>> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
>> msec
>>
>> Could not find any errors in the logs, but when I check the namenode
>> logs, I get the following error:
>>
>> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00094_flush_length
>> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00095_flush_length
>> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00096_flush_length
>> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00097_flush_length
>> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00098_flush_length
>> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 2 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00099_flush_length
>> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 8 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00012_flush_length
>> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 5 on 9000, call org.apache.hadoop.hdfs.protoco
>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4
>> Retry#0: java.io.FileNotFoundException: File does not exist:
>> /user/hive/warehouse/mls_public_record_association_snapshot_
>> orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>>
>> I also searched for the flush_length files in the above-mentioned
>> location, but I only see the bucket files, with none ending in flush_length.
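One quick way to confirm which buckets are missing their side files is to diff the directory listing against the expected *_flush_length names. A minimal sketch, assuming `filenames` holds the file names from an `hdfs dfs -ls` of the delta directory (the helper name is hypothetical):

```python
def missing_flush_lengths(filenames):
    """Return bucket files that have no matching *_flush_length side file."""
    buckets = {f for f in filenames
               if f.startswith('bucket_') and not f.endswith('_flush_length')}
    with_side = {f[:-len('_flush_length')]
                 for f in filenames if f.endswith('_flush_length')}
    return sorted(buckets - with_side)
```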
>>
>> Any clue or help would be highly appreciated.
>>
>> Regards,
>> Satyajit.
>>
>>
>
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
PFB the EXPLAIN plan from Hive; the query has issues at Stage-14.
I have also run another query, which also fails in the *Map Reduce Local
Work* stage.
STAGE DEPENDENCIES:
Stage-13 is a root stage , consists of Stage-1
Stage-1
Stage-10 depends on stages: Stage-1 , consists of Stage-15, Stage-2
Stage-15 has a backup stage: Stage-2
Stage-9 depends on stages: Stage-15
Stage-14 depends on stages: Stage-2, Stage-9
Stage-8 depends on stages: Stage-14
Stage-0 depends on stages: Stage-8
Stage-4 depends on stages: Stage-0
Stage-2
STAGE PLANS:
Stage: Stage-13
Conditional Operator
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: acl
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Filter Operator
predicate: listing_url is not null (type: boolean)
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: httpstatuscode (type: string), accesstime
(type: timestamp), useragent (type: string), page_id (type: string),
listing_url (type: string), requesttime (type: string), upstreamheadertime
(type: string), upstreamresponsetime (type: string), httpverb (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col4 (type: string)
sort order: +
Map-reduce partition columns: _col4 (type: string)
Statistics: Num rows: 1689960 Data size: 1419567234 Basic
stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
timestamp), _col2 (type: string), _col3 (type: string), _col5 (type:
string), _col6 (type: string), _col7 (type: string), _col8 (type: string)
TableScan
alias: pra
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Filter Operator
predicate: url is not null (type: boolean)
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), primary_listing_id (type:
string), public_record_id (type: string), url_type (type: string), url
(type: string), created_at (type: bigint), updated_at (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6
Statistics: Num rows: 19232793 Data size: 9924121600 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col4 (type: string)
sort order: +
Map-reduce partition columns: _col4 (type: string)
Statistics: Num rows: 19232793 Data size: 9924121600
Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
string), _col2 (type: string), _col3 (type: string), _col5 (type: bigint),
_col6 (type: bigint)
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
0 _col4 (type: string)
1 _col4 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16
Statistics: Num rows: 21156072 Data size: 10916533996 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-10
Conditional Operator
Stage: Stage-15
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_2:ml
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_2:ml
TableScan
alias: ml
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), mls_id (type: int),
standard_status (type: int), updated_at (type: bigint), created_at (type:
bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col11 (type: string)
1 _col0 (type: string)
Stage: Stage-9
Map Reduce
Map Operator Tree:
TableScan
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col11 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col19, _col20, _col21
Statistics: Num rows: 185838144 Data size: 23043929997 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Local Work:
Map Reduce Local Work
Stage: Stage-14
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_3:a
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_3:a
TableScan
alias: a
Statistics: Num rows: 493 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
Select Operator
expressions: id (type: int), name (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 493 Data size: 51322 Basic stats:
COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col19 (type: int)
1 _col0 (type: int)
Stage: Stage-8
Map Reduce
Map Operator Tree:
TableScan
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col19 (type: int)
1 _col0 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col20, _col21, _col23
Statistics: Num rows: 204421962 Data size: 25348323546 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col14 (type: string), _col13 (type: string),
CAST( _col15 AS TIMESTAMP) (type: timestamp), CAST( _col16 AS TIMESTAMP)
(type: timestamp), CAST( _col21 AS TIMESTAMP) (type: timestamp), CAST(
_col20 AS TIMESTAMP) (type: timestamp), _col23 (type: string), _col12
(type: string), _col1 (type: timestamp), _col10 (type: string), _col11
(type: string), datediff(_col1, CAST( _col21 AS TIMESTAMP)) (type: int),
datediff(_col1, CAST( _col20 AS TIMESTAMP)) (type: int), datediff(_col1,
CAST( _col15 AS TIMESTAMP)) (type: int), datediff(_col1, CAST( _col16 AS
TIMESTAMP)) (type: int), CASE WHEN ((_col12 is not null and (trim(_col12)
<> ''))) THEN (CASE WHEN ((_col11 is null or (trim(_col11) = ''))) THEN
('PR') ELSE ('PRMLS') END) ELSE (CASE WHEN ((_col11 is not null and
(trim(_col11) <> ''))) THEN ('MLS') ELSE ('UNKNOWN') END) END (type:
string), _col0 (type: string), _col8 (type: string), _col5 (type: string),
_col6 (type: string), _col7 (type: string), _col3 (type: string), _col2
(type: string), substring(split(_col14, '/')[0], 0, (length(split(_col14,
'/')[0]) - 3)) (type: string), substring(split(_col14, '/')[0],
(length(split(_col14, '/')[0]) - 1), 2) (type: string), _col18 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14,
_col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23,
_col24, _col25
Statistics: Num rows: 204421962 Data size: 25348323546
Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 204421962 Data size: 25348323546
Basic stats: COMPLETE Column stats: NONE
table:
input format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
output format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
serde:
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
name: access_logs.crawlstats_dpp
Local Work:
Map Reduce Local Work
Stage: Stage-0
Move Operator
tables:
partition:
day 2016-10-23
replace: false
table:
input format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
output format:
org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
serde:
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
name: access_logs.crawlstats_dpp
Stage: Stage-4
Stats-Aggr Operator
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
Reduce Output Operator
key expressions: _col11 (type: string)
sort order: +
Map-reduce partition columns: _col11 (type: string)
Statistics: Num rows: 21156072 Data size: 10916533996 Basic
stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type:
timestamp), _col2 (type: string), _col3 (type: string), _col5 (type:
string), _col6 (type: string), _col7 (type: string), _col8 (type: string),
_col10 (type: string), _col12 (type: string), _col13 (type: string), _col14
(type: string), _col15 (type: bigint), _col16 (type: bigint)
TableScan
alias: ml
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: string), mls_id (type: int),
standard_status (type: int), updated_at (type: bigint), created_at (type:
bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 168943764 Data size: 20949026816 Basic
stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 168943764 Data size: 20949026816
Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int),
_col3 (type: bigint), _col4 (type: bigint)
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col11 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8, _col10, _col11, _col12, _col13, _col14, _col15, _col16,
_col18, _col19, _col20, _col21
Statistics: Num rows: 185838144 Data size: 23043929997 Basic
stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Regards,
Satyajit.
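For context on where Stage-14 sits in the plan above: the Map Reduce Local Work stages build an in-memory hash table from the small side of a join (the HashTable Sink Operator), which map tasks then probe (the Map Join Operator). A minimal sketch of that left-outer map-join pattern (illustrative only, not Hive's implementation; unmatched rows here get None only for the small-side key):

```python
def map_join(small_rows, big_rows, small_key, big_key):
    """Left-outer map join: hash the small table, stream the big one."""
    # Build phase: what the HashTable Sink Operator does in local work.
    table = {}
    for row in small_rows:
        table.setdefault(row[small_key], []).append(row)
    # Probe phase: each big-side row looks up the hash table.
    for row in big_rows:
        matches = table.get(row[big_key])
        if matches:
            for m in matches:
                yield {**row, **m}
        else:
            # Simplified left-outer semantics: null-fill the join key.
            yield {**row, small_key: None}
```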
On Wed, Oct 26, 2016 at 2:14 PM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:
> Hi Eugene,
>
> PFB Transaction table in green and parquet tables in yellow,
>
> INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select
> pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp)
> as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
> then (case when (pra.primary_listing_id is null or
> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
> acl.httpstatuscode, acl.httpverb, acl.requesttime,
> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
> AS user_agent, substring(split(pra.url,'/')[0],
> 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0],
> length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM
> access_logs.loadbalancer_accesslogs acl inner join
> mls_public_record_association_snapshot_orc pra on acl.listing_url =
> pra.url left outer join mls_listing_snapshot_orc ml on
> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>
>
> Any clue, or something that you would want me to focus on to debug the
> issue.
>
> Regards,
> Satyajit.
>
>
>
> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ek...@hortonworks.com>
> wrote:
>
>> Which of your tables are transactional? Can you provide the DDL?
>>
>> I don’t think “File does not exist” error is causing your queries to
>> fail. It’s an INFO level msg.
>> There should be some other error.
>>
>> Eugene
>>
>>
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by Eugene Koifman <ek...@hortonworks.com>.
If you can run this, then it's safe to ignore the "bucket_00094_flush_length" messages and the issue is somewhere else:
select count(*) from mls_public_record_association_snapshot_orc pra left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left outer join attribute a on a.id = ml.standard_status
Eugene
From: satyajit vegesna <sa...@gmail.com>>
Date: Wednesday, October 26, 2016 at 2:14 PM
To: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>, Eugene Koifman <ek...@hortonworks.com>>
Cc: "dev@hive.apache.org<ma...@hive.apache.org>" <de...@hive.apache.org>>
Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.
Hi Eugene,
PFB Transaction table in green and parquet tables in yellow,
INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS timestamp) as mlUpdated, a.name<http://a.name/> as status, pra.public_record_id as prId, acl.accesstime as crawledon, pra.id<http://pra.id/> as propId, pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '') then (case when (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType, acl.httpstatuscode, acl.httpverb, acl.requesttime, acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent AS user_agent, substring(split(pra.url,'/')[0], 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner join mls_public_record_association_snapshot_orc pra on acl.listing_url = pra.url left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id<http://ml.id/> left outer join attribute a on a.id<http://a.id/> = ml.standard_status WHERE acl.accesstimedate="2016-10-23";
Any clue, or something that you would want me to focus on to debug the issue.
Regards,
Satyajit.
On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ek...@hortonworks.com>> wrote:
Which of your tables are are transactional? Can you provide the DDL?
I don’t think “File does not exist” error is causing your queries to fail. It’s an INFO level msg.
There should be some other error.
Eugene
From: satyajit vegesna <sa...@gmail.com>>
Reply-To: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>
Date: Tuesday, October 25, 2016 at 5:46 PM
To: "user@hive.apache.org<ma...@hive.apache.org>" <us...@hive.apache.org>>, "dev@hive.apache.org<ma...@hive.apache.org>" <de...@hive.apache.org>>
Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.
HI All,
i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries like simple insert,
set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;set hive.exec.max.dynamic.partitions.pernode=400;set hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set mapred.tasktracker.reduce.tasks.maximum=20;set mapred.reduce.child.java.opts=-Xmx2048m;set mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true; set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS timestamp) as mlUpdated, a.name<http://a.name> as status, pra.public_record_id as prId, acl.accesstime as crawledon, pra.id<http://pra.id> as propId, pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '') then (case when (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType, acl.httpstatuscode, acl.httpverb, acl.requesttime, 
acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent AS user_agent, substring(split(pra.url,'/')[0], 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner join mls_public_record_association_snapshot_orc pra on acl.listing_url = pra.url left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id<http://ml.id> left outer join attribute a on a.id<http://a.id> = ml.standard_status WHERE acl.accesstimedate="2016-10-23";
I finally end up getting the error below:
2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU 1478.96 sec
2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU 1636.58 sec
2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU 1764.97 sec
2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU 1879.61 sec
2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU 2051.38 sec
2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU 2151.49 sec
2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU 2179.67 sec
2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU 2187.86 sec
2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU 2205.22 sec
2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2241.25 sec
MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
Ended Job = job_1477437520637_0009
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2016-10-25 17:40:35Starting to launch local task to process map join;maximum memory = 514850816
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-14
Logs:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS Read: 8060555201 HDFS Write: 757253756 SUCCESS
Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS Read: 2407914653 HDFS Write: 805874953 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360 msec
I could not find any errors in the logs, but when I check the namenode logs, I get the following errors:
2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
I also searched for the flush_length files in the location mentioned above, but I only see the bucket files, with none ending in _flush_length.
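(Editor's note: as I understand it, the `_flush_length` side files are written by the ACID writer while a delta is open and removed on a clean close, so their absence on disk is not by itself unusual. A minimal sketch of the kind of check being described; the directory name is taken from the namenode log above, but the contents here are simulated locally for illustration, since the real files live in HDFS and would be fetched with something like `hdfs dfs -get`:)

```shell
# Simulate the delta directory from the namenode log locally (illustration only;
# the real directory lives under /user/hive/warehouse/... in HDFS).
DELTA_DIR=delta_0000002_0000002_0000
mkdir -p "$DELTA_DIR"
touch "$DELTA_DIR/bucket_00094" "$DELTA_DIR/bucket_00095"

# Count bucket files present in the delta:
ls "$DELTA_DIR" | grep -c '^bucket_[0-9]*$'
# Count side files; grep -c prints 0 (and exits non-zero) when none exist:
ls "$DELTA_DIR" | grep -c '_flush_length$' || true
```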
Any clue or help would be highly appreciated.
Regards,
Satyajit.
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by satyajit vegesna <sa...@gmail.com>.
Hi Eugene,
Please find below; the transactional table was highlighted in green and the Parquet tables in yellow:
INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select
pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp)
as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
(pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
then (case when (pra.primary_listing_id is null or
TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
<> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
acl.httpstatuscode, acl.httpverb, acl.requesttime,
acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
AS user_agent, substring(split(pra.url,'/')[0],
0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0],
length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM
access_logs.loadbalancer_accesslogs acl inner join
mls_public_record_association_snapshot_orc pra on acl.listing_url = pra.url
left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =
ml.id left outer join attribute a on a.id = ml.standard_status WHERE
acl.accesstimedate="2016-10-23";
Any clue, or something that you would want me to focus on to debug the
issue.
Regards,
Satyajit.
On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ek...@hortonworks.com>
wrote:
> Which of your tables are transactional? Can you provide the DDL?
>
> I don’t think “File does not exist” error is causing your queries to
> fail. It’s an INFO level msg.
> There should be some other error.
>
> Eugene
>
>
> From: satyajit vegesna <sa...@gmail.com>
> Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
> Date: Tuesday, October 25, 2016 at 5:46 PM
> To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org" <
> dev@hive.apache.org>
> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
> execution engine.
>
> HI All,
>
> i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries
> like simple insert,
>
> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set
> hive.exec.dynamic.partition.mode=nonstrict;set hive.exec.max.dynamic.partitions.pernode=400;set
> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set
> mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set
> mapred.tasktracker.reduce.tasks.maximum=20;set
> mapred.reduce.child.java.opts=-Xmx2048m;set mapred.map.child.java.opts=-Xmx2048m;
> set hive.support.concurrency=true; set hive.txn.manager=org.apache.
> hadoop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=false;
> set hive.compactor.worker.threads=1;set mapreduce.job.queuename=default;set
> hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;INSERT
> INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url
> as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS
> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when
> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
> then (case when (pra.primary_listing_id is null or
> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case
> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id)
> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
> acl.httpstatuscode, acl.httpverb, acl.requesttime,
> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent
> AS user_agent, substring(split(pra.url,'/')[0],
> 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0],
> length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM
> access_logs.loadbalancer_accesslogs acl inner join
> mls_public_record_association_snapshot_orc pra on acl.listing_url =
> pra.url left outer join mls_listing_snapshot_orc ml on
> pra.primary_listing_id = ml.id left outer join attribute a on a.id =
> ml.standard_status WHERE acl.accesstimedate="2016-10-23";
>
> i finally end up getting below error,
>
> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU
> 1478.96 sec
> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU
> 1636.58 sec
> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU
> 1764.97 sec
> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU
> 1879.61 sec
> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU
> 2051.38 sec
> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU
> 2151.49 sec
> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU
> 2179.67 sec
> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU
> 2187.86 sec
> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU
> 2205.22 sec
> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative CPU
> 2241.25 sec
> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
> Ended Job = job_1477437520637_0009
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.
> 1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/
> share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.
> Log4jLoggerFactory]
> 2016-10-25 17:40:35Starting to launch local task to process map join;maximum
> memory = 514850816
> Execution failed with exit status: 2
> Obtaining error information
>
> Task failed!
> Task ID:
> Stage-14
>
> Logs:
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.
> exec.mr.MapredLocalTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS
> Read: 8060555201 HDFS Write: 757253756 SUCCESS
> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS
> Read: 2407914653 HDFS Write: 805874953 SUCCESS
> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360
> msec
>
> Could not find any errors in logs, but when i check namenode logs , oi get
> the following error,
>
> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations
> from 192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException:
> File does not exist: /user/hive/warehouse/mls_public_record_association_
> snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>
> i also search for find the flush_length files in the above mentioned
> location, but i only see buckets but no files ending with flush_length.
>
> Any clue or help would be highly appreciated.
>
> Regards,
> Satyajit.
>
>
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
execution engine.
Posted by Eugene Koifman <ek...@hortonworks.com>.
Which of your tables are transactional? Can you provide the DDL?
I don’t think the “File does not exist” error is causing your queries to fail. It’s an INFO-level message.
There should be some other error.
Eugene
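(Editor's note: a transactional-table DDL of the kind being requested generally looks like the sketch below. The table name echoes the one in the logs, but the columns and bucket count are invented for illustration; the parts that matter for ACID in Hive 2.x are the bucketing, the ORC storage format, and the transactional table property.)

```sql
-- Hypothetical sketch only: Hive 2.x ACID tables must be bucketed,
-- stored as ORC, and flagged transactional in TBLPROPERTIES.
CREATE TABLE mls_public_record_association_snapshot_orc (
  id                 STRING,
  url                STRING,
  primary_listing_id STRING
)
CLUSTERED BY (id) INTO 100 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
```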
From: satyajit vegesna <sa...@gmail.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
Date: Tuesday, October 25, 2016 at 5:46 PM
To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org" <de...@hive.apache.org>
Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.
Hi All,
I am using Hive 2.1.0 and Hadoop 2.7.2, but when I try running queries like a simple insert,
set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;set hive.exec.max.dynamic.partitions.pernode=400;set hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set mapreduce.reduce.memory.mb=5120;set mapred.tasktracker.map.tasks.maximum=30;set mapred.tasktracker.reduce.tasks.maximum=20;set mapred.reduce.child.java.opts=-Xmx2048m;set mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true; set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId, acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '') then (case when (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType, acl.httpstatuscode, acl.httpverb, acl.requesttime,
acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent AS user_agent, substring(split(pra.url,'/')[0], 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner join mls_public_record_association_snapshot_orc pra on acl.listing_url = pra.url left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left outer join attribute a on a.id = ml.standard_status WHERE acl.accesstimedate="2016-10-23";
I finally end up getting the error below:
2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU 1478.96 sec
2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU 1636.58 sec
2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU 1764.97 sec
2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU 1879.61 sec
2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU 2051.38 sec
2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU 2151.49 sec
2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU 2179.67 sec
2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU 2187.86 sec
2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU 2205.22 sec
2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2241.25 sec
MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
Ended Job = job_1477437520637_0009
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2016-10-25 17:40:35	Starting to launch local task to process map join; maximum memory = 514850816
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-14
Logs:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS Read: 8060555201 HDFS Write: 757253756 SUCCESS
Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS Read: 2407914653 HDFS Write: 805874953 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360 msec
I could not find any errors in the logs, but when I check the namenode logs, I get the following:
2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length
2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00095_flush_length
2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00096_flush_length
2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00097_flush_length
2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00098_flush_length
2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00099_flush_length
2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00012_flush_length
2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4 Retry#0: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/mls_public_record_association_snapshot_orc/delta_0000002_0000002_0000/bucket_00075_flush_length
I also searched for the flush_length files in the location mentioned above, but I only see the bucket files; there are no files ending with _flush_length.
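For context, the *_flush_length files are ORC ACID "side files" that a writer keeps next to a delta's bucket files while the delta is still open; once the delta is properly closed they are removed. A quick way to cross-check a delta directory listing is to pair each bucket file with its side file. This is a minimal sketch, not Hive code; the file names below are illustrative, not taken from a real cluster:

```python
# Given a listing of an ACID delta directory, report bucket files whose
# "_flush_length" side file is missing. A missing side file is expected for
# a closed delta; a reader that still looks for one will log
# FileNotFoundException probes like the namenode entries above.
def missing_side_files(names):
    names = set(names)
    # Bucket data files, excluding the side files themselves.
    buckets = [n for n in names
               if n.startswith("bucket_") and not n.endswith("_flush_length")]
    return sorted(b for b in buckets if b + "_flush_length" not in names)

listing = ["bucket_00094", "bucket_00095", "bucket_00095_flush_length"]
print(missing_side_files(listing))  # → ['bucket_00094']
```

Feed it the output of `hdfs dfs -ls` on the delta directory to see which buckets have no side file.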
Any clue or help would be highly appreciated.
Regards,
Satyajit.
Re: Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.
Posted by Eugene Koifman <ek...@hortonworks.com>.
Which of your tables are transactional? Can you provide the DDL?
I don’t think the “File does not exist” error is causing your queries to fail. It’s an INFO-level msg.
There should be some other error.
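One way to act on this advice is to skip the INFO-level probes and scan the logs for WARN/ERROR/FATAL entries only. A minimal sketch (plain Python; the sample lines are illustrative, and the Hadoop log layout assumed here is "date time LEVEL class: message"):

```python
import re

def severe_lines(lines):
    # Match lines whose third field is WARN, ERROR, or FATAL, e.g.
    # "2016-10-25 17:01:51,923 ERROR org.apache...: message"
    pat = re.compile(r"^\d{4}-\d{2}-\d{2} \S+ (WARN|ERROR|FATAL)\b")
    return [line for line in lines if pat.match(line)]

sample = [
    "2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: benign probe",
    "2016-10-25 17:02:00,001 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: real problem",
]
for line in severe_lines(sample):
    print(line)  # only the ERROR line survives the filter
```

The equivalent one-liner on the box would be a grep for ` ERROR \| WARN ` over the client-side hive.log, where a MapredLocalTask failure usually lands.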
Eugene
From: satyajit vegesna <sa...@gmail.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
Date: Tuesday, October 25, 2016 at 5:46 PM
To: "user@hive.apache.org" <us...@hive.apache.org>, "dev@hive.apache.org" <de...@hive.apache.org>
Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr execution engine.