Posted to common-user@hadoop.apache.org by Marc Sturlese <ma...@gmail.com> on 2011/02/24 16:12:28 UTC

Check lzo is working on intermediate data

Hey there,
I am using Hadoop 0.20.2. I've successfully installed LZO compression
following these steps:
https://github.com/kevinweil/hadoop-lzo

I have some MR jobs written with the new API and I want to compress
intermediate data.
Not sure if my mapred-site.xml should have the properties:

  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

or:

  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

How can I check that the compression is being applied?

Thanks in advance

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Check-lzo-is-working-on-intermediate-data-tp2567704p2567704.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

Re: Check lzo is working on intermediate data

Posted by Da Zheng <zh...@gmail.com>.
I use the first pair (the mapred.* names), and it seems to work: the size of
the data output by the mappers is much smaller.
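As a quick sanity check on the configuration side, here is a minimal sketch that reads a Hadoop-style config and reports which of the two property pairs from the question, if any, turns map output compression on. The helper names (read_properties, map_output_compression) and the SAMPLE text are made up for illustration. It only tells you what the file says, not what a running job actually used.

```python
import xml.etree.ElementTree as ET

# The two (enable flag, codec) property pairs quoted in the question.
NAME_PAIRS = [
    ("mapred.compress.map.output", "mapred.map.output.compression.codec"),
    ("mapreduce.map.output.compress", "mapreduce.map.output.compress.codec"),
]


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def map_output_compression(props):
    """Return (flag_name, codec) for the first enabled pair, else None."""
    for flag, codec in NAME_PAIRS:
        if props.get(flag, "").lower() == "true":
            return flag, props.get(codec)
    return None


# Hypothetical file contents, mirroring the first pair from the question.
SAMPLE = """<configuration>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(map_output_compression(read_properties(SAMPLE)))
```

To run it against a real file, pass the contents of mapred-site.xml to read_properties instead of SAMPLE.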

Da



Re: Check lzo is working on intermediate data

Posted by James Seigel <ja...@tynt.com>.
Run a standard job before making the changes and look at the summary data.

Run the job again after the changes and look at the summary.

You should see fewer file system bytes written from the map stage.
Sorry, it might be most obvious in the shuffle bytes.
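The before/after comparison can be sketched roughly like this. The counter labels ("FILE_BYTES_WRITTEN", "Reduce shuffle bytes") are assumptions about how the job summary names them and may differ in your version, and the numbers are invented, not real measurements. Copy the actual values from the two job summaries by hand.

```python
def percent_reduction(before, after):
    """Percentage by which a counter shrank between the two runs."""
    if before <= 0:
        raise ValueError("the 'before' value must be positive")
    return 100.0 * (before - after) / before


def compare_counters(before, after, names):
    """Return {counter name: percent reduction} for counters in both runs."""
    return {
        name: percent_reduction(before[name], after[name])
        for name in names
        if name in before and name in after
    }


# Invented values for illustration only; the labels are assumed.
BEFORE = {"FILE_BYTES_WRITTEN": 4_000_000_000, "Reduce shuffle bytes": 2_000_000_000}
AFTER = {"FILE_BYTES_WRITTEN": 1_000_000_000, "Reduce shuffle bytes": 500_000_000}

if __name__ == "__main__":
    report = compare_counters(BEFORE, AFTER, ["FILE_BYTES_WRITTEN", "Reduce shuffle bytes"])
    for name, pct in report.items():
        print(f"{name}: {pct:.1f}% smaller after enabling compression")
```

A clear drop in both counters on the same input suggests the codec is being applied to the intermediate data; no change suggests the properties are not being picked up.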

I don't have a terminal in front of me right now.

James

Sent from my mobile. Please excuse the typos.
