Posted to mapreduce-user@hadoop.apache.org by Sanjay Subramanian <Sa...@wizecommerce.com> on 2013/05/21 20:30:37 UTC

Subject : Errors using MultipleOutputs and LZO compression
============================================
Hi

In Cloudera Manager 4.1.2, we have defined an MR action in Oozie through the Hue interface.
This MR action reads GZIP input files (typically 350+ gzip files, ranging from 20MB to 200MB each) and writes either GZIP or LZO output files.
In the reducer, MultipleOutputs is used to write the output.
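
For context, the reducer's MultipleOutputs wiring looks roughly like the sketch below; the class name and output path are illustrative, not our exact code.

import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Minimal sketch of the reducer's MultipleOutputs usage (names illustrative).
public class ImpressionReducer extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // A baseOutputPath such as "header/<date>/<host>/part" fans records
            // out into per-date, per-host files under the job output directory,
            // which is where the part-r-*.lzo paths in the logs below come from.
            mos.write(NullWritable.get(), value, "header/" + key.toString() + "/part");
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // MultipleOutputs is closed here so its streams are not left for the
        // JVM shutdown hook to flush after the task attempt directory is gone.
        mos.close();
    }
}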

Success Use Cases
=================
1. mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec
2. mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
3. mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec
    Reduced set of input gzip files (5-10 gzip files only)

Failure Use Case
================
1. mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec
    350 gzip input files
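
The only thing that changes between the success and failure runs is the codec property (and the size of the input set). In driver terms it amounts to the sketch below; the class and job names are illustrative, and in our setup the same properties are supplied through the Oozie action configuration rather than driver code.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch of how the output codec is switched per run (names illustrative).
public class DriverSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);
        // GzipCodec and SnappyCodec runs succeed; LzopCodec fails on the
        // full 350-file input set.
        conf.set("mapreduce.output.fileoutputformat.compress.codec",
                 "com.hadoop.compression.lzo.LzopCodec");
        Job job = Job.getInstance(conf, "impressions-output");
        // ... mapper/reducer classes and input/output paths as usual ...
    }
}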

Errors in Logs
==============
2013-05-20 16:05:44,849 ERROR [Thread-2] org.apache.hadoop.hdfs.DFSClient: Failed to close file /user/sasubramanian/impressions/output/outpdir/2013-03-19/0000044-130515165107614-oozie-oozi-W/_temporary/1/_temporary/attempt_1368666339740_5579_r_000011_3/header/2013-03-19/ieeuu3.pv.ie.nextag.com/part-r-00011.lzo
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/sasubramanian/impressions/output/outpdir/2013-03-19/0000044-130515165107614-oozie-oozi-W/_temporary/1/_temporary/attempt_1368666339740_5579_r_000011_3/header/2013-03-19/ieeuu3.pv.ie.nextag.com/part-r-00011.lzo File does not exist. [Lease.  Holder: DFSClient_attempt_1368666339740_5579_r_000011_3_-1369131598_1, pendingcreates: 3]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2308)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2299)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2366)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2343)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:526)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:335)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44084)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

    at org.apache.hadoop.ipc.Client.call(Client.java:1160)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at $Proxy10.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at $Proxy10.complete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:329)
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1769)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1756)
    at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:654)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:671)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2308)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2324)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
2013-05-20 16:05:44,850 WARN [Thread-895] org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception

Need your help.
Thanks


sanjay

