Posted to user@hive.apache.org by w00t w00t <w0...@yahoo.de> on 2013/08/13 11:26:14 UTC

LZO output compression

Hello,

I am running Hortonworks 1.2 using Hadoop 1.1.2.21 and Hive 0.10.0.21.

I set up LZO compression and can read LZO-compressed data without problems.

Next, I wanted to test output compression, so I created the following small script:
--------------------------------------------------------------------------------------------------------------------------
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;

DROP TABLE IF EXISTS simple_lzo;

CREATE TABLE simple_lzo
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' AS 
SELECT count(*) 
FROM txt_table_lzo;

----------------------------------------------------------------------------------------------------------------------------

The output gets compressed, but with the default codec ("deflate") rather than LZO.

Do you know what the problem could be and how I could debug it?
There are no error messages.

I also tried the settings for Hadoop 0.20:
mapred.output.compress=true;
mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec

That didn't work either.


In Pig or Java MapReduce, I have no problems generating LZO-compressed output.


Thanks

Hiveserver2 Beeline command clarification

Posted by Sanjay Subramanian <Sa...@wizecommerce.com>.

Hi guys

I just hooked up HiveServer2 to LDAP. In Beeline I realized you can log in as follows, without specifying the driver class "org.apache.hive.jdbc.HiveDriver":

beeline> !connect jdbc:hive2://dev-thdp5:10000 sanjay.subramanian@wizecommerce.com
scan complete in 2ms
Connecting to jdbc:hive2://dev-thdp5:10000
Enter password for jdbc:hive2://dev-thdp5:10000: ********
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.3.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://dev-thdp5:10000> show tables;
+--------------------------+
|         tab_name         |
+--------------------------+
| keyword_impressions_log  |
+--------------------------+
1 row selected (1.574 seconds)
0: jdbc:hive2://dev-thdp5:10000>

If this is also a correct way to use Beeline, then I actually prefer it, since the password is not visible.

sanjay




CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.

Re: LZO output compression

Posted by Sanjay Subramanian <Sa...@wizecommerce.com>.

Check this class, where these properties are defined:
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1/src/mapred/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java

From: w00t w00t <w0...@yahoo.de>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>, w00t w00t <w0...@yahoo.de>
Date: Tuesday, August 13, 2013 2:39 AM
To: "user@hive.apache.org" <us...@hive.apache.org>, w00t w00t <w0...@yahoo.de>
Subject: Re: LZO output compression

Oh, I got it working with these settings:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

But I have one question that maybe one of you can help explain:
Since I am running Hadoop 1.1.*, why do I need the old Hadoop 0.20 property?
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

I assumed the settings for the newer Hadoop versions were:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;



Re: LZO output compression

Posted by w00t w00t <w0...@yahoo.de>.

Oh, I got it working with these settings:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
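
For comparison with the earlier Hadoop 0.20 attempt, the sketch below shows how the Hadoop 1.x property names appear to divide up: `mapred.map.output.compression.codec` covers only the intermediate map output that is shuffled to the reducers, while `mapred.output.compression.codec` covers the files the job finally writes. Using `LzoCodec` on the map side is an illustrative assumption, not something taken from this thread.

```sql
-- Intermediate map output (shuffle data); this does not affect the table files.
-- LzoCodec here is an assumed, illustrative choice from the hadoop-lzo package.
SET mapred.compress.map.output=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;

-- Final job output, i.e. the files Hive writes for the table.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
```

Setting only the map-side codec would leave the final output on the default codec, which matches the "deflate" result described in the original post.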


But I have one question that maybe one of you can help explain:
Since I am running Hadoop 1.1.*, why do I need the old Hadoop 0.20 property?
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;


I assumed the settings for the newer Hadoop versions were:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
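
A likely explanation, offered with some caution: Hadoop 1.x descends from the 0.20 branch and still reads the `mapred.*` keys, while the `mapreduce.output.fileoutputformat.*` names belong to the later 0.21/2.x line, so on this cluster they would be accepted by SET but never read. A minimal sketch for checking what is in effect from the Hive CLI follows; the warehouse path is an assumption and depends on `hive.metastore.warehouse.dir`.

```sql
-- Print the values Hive will pass to the job (SET with no value shows the current setting).
SET hive.exec.compress.output;
SET mapred.output.compression.codec;

-- After the CREATE TABLE ... AS SELECT has run, list the table's files.
-- The path below is an assumption; check hive.metastore.warehouse.dir for the real location.
dfs -ls /user/hive/warehouse/simple_lzo;
```

Files written with LzopCodec should carry a .lzo extension, whereas the default codec produces .deflate files, so the listing shows which codec was actually applied.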


