Posted to mapreduce-user@hadoop.apache.org by KingDavies <ki...@gmail.com> on 2014/03/06 11:29:35 UTC

MR2 Job over LZO data

Running on Hadoop 2.2.0

The Java MR2 job works as expected on an uncompressed data source using
TextInputFormat.class, but when using the LZO input format the job fails:
import com.hadoop.mapreduce.LzoTextInputFormat;
job.setInputFormatClass(LzoTextInputFormat.class);
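
For context, the full driver is essentially the stock MR2 boilerplate with only the input format swapped; a minimal sketch (the class name and identity mapper wiring are placeholders, and it compiles only with hadoop-client and the hadoop-lzo jar on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.hadoop.mapreduce.LzoTextInputFormat;

public class LzoJobDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "lzo-example");
        job.setJarByClass(LzoJobDriver.class);
        // The only change relative to the working uncompressed job:
        // LzoTextInputFormat instead of TextInputFormat.
        job.setInputFormatClass(LzoTextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The exception below is thrown at submit time, before any task runs, because getSplits() is the first call into the LZO input format.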

Dependencies from the maven repository:
http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
Also tried with elephant-bird-core 4.4
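
For reference, those coordinates would be declared in the pom roughly like this (the repository id is an assumption; the groupId/artifactId/version are taken from the URL above):

```xml
<repositories>
  <repository>
    <id>twttr</id>
    <url>http://maven.twttr.com</url>
  </repository>
</repositories>

<dependency>
  <groupId>com.hadoop.gplcompression</groupId>
  <artifactId>hadoop-lzo</artifactId>
  <version>0.4.19</version>
</dependency>
```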

The same data can be queried fine from within Hive (0.12) on the same
cluster.


The exception:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at
com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at
com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
at
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at com.cloudreach.DataQuality.Main.main(Main.java:42)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I believe the issue is related to the changes in Hadoop 2, but where can I
find a Hadoop 2-compatible version?

Thanks

Re: MR2 Job over LZO data

Posted by Stanley Shi <ss...@gopivotal.com>.
Maybe you can try downloading the LZO source and rebuilding it against
Hadoop 2.2.0.
If the build succeeds, you should be good to go;
if it fails, you may need to wait for the LZO maintainers to update their
code.

Regards,
*Stanley Shi,*



On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:

> Running on Hadoop 2.2.0
>
> The Java MR2 job works as expected on an uncompressed data source using
> the TextInputFormat.class.
> But when using the LZO format the job fails:
> import com.hadoop.mapreduce.LzoTextInputFormat;
> job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4
>
> The same data can be queried fine from within Hive(0.12) on the same
> cluster.
>
>
> The exception:
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>  at
> com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>  at
> com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>  at
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>  at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
>  at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>  at com.cloudreach.DataQuality.Main.main(Main.java:42)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the changes in Hadoop 2, but where can I
> find a H2 compatible version?
>
> Thanks
>

Re: MR2 Job over LZO data

Posted by Gordon Wang <gw...@gopivotal.com>.
Can you run plain MR jobs (not Pig jobs) that take LZO files as input?

If you cannot run MR jobs, you may want to check the LZO compression
configuration in core-site.xml, and make sure the dynamic library is in
HADOOP_HOME/lib/native/.

Here is an FAQ on configuring LZO:
https://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/wiki/FAQ?redir=1
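
For reference, the core-site.xml codec registration typically looks like the following (a sketch; the exact codec list depends on your cluster, and the com.hadoop.compression.lzo classes come from the hadoop-lzo jar):

```xml
<!-- core-site.xml: register the LZO codecs alongside the defaults -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```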






On Sat, Mar 8, 2014 at 12:04 AM, Viswanathan J
<ja...@gmail.com>wrote:

> Hi,
>
> Getting the below error while running pig job in hadoop-2.x,
>
> Caused by: java.io.IOException: No codec for file found
> at
> com.twitter.elephantbird.mapreduce.input.MultiInputFormat.determineFileFormat(MultiInputFormat.java:176)
> at
> com.twitter.elephantbird.mapreduce.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:88)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:256)
>
> Have copied the respective lzo jars to lib folders, but facing this issue.
>
> pls help.
>
>
>
> On Fri, Mar 7, 2014 at 7:53 PM, German Florez-Larrahondo <
> german.fl@samsung.com> wrote:
>
>> King
>>
>> Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0
>>
>>
>>
>> I hope this helps
>>
>>
>>
>> ./g
>>
>>
>>
>>
>>
>> *Where to get Hadoop LZO*
>>
>> https://github.com/twitter/hadoop-lzo
>>
>>
>>
>>
>> http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html
>>
>>
>>
>> *Requirements*
>>
>> On CentOS:
>>
>> sudo yum install lzo*  --> /usr/lib64/liblzo2.so.2....
>>
>>
>>
>> On ubuntu:
>>
>> sudo apt-get install liblzo -->  on X86:  /usr/lib64/liblzo2.so.2
>>
>>
>>
>> *Clone:*
>>
>> git clone https://github.com/twitter/hadoop-lzo.git
>>
>>
>>
>> Follow instructions on README.md from this github site, basically
>>
>>
>>
>>  cd hadoop-lzo
>>
>> *     mvn clean package  test*
>>
>>
>>
>> *To enable this at run time do:*
>>
>> a.       Copy the library to the hadoop/share/common (if  you don't want
>> to modify classpaths by putting the library somewhere else)
>>
>>
>>
>> cp lzo..././target/hadoop-lzo-0.4.20-SNAPSHOT.jar  ..
>> hadoop/share/hadoop/common/
>>
>>
>>
>> a.       Copy /usr/lib64/liblzo2.so.2 to  .. Hadoop/lib/native/
>>
>>
>>
>>
>>
>> *From:* Gordon Wang [mailto:gwang@gopivotal.com]
>> *Sent:* Thursday, March 06, 2014 11:50 PM
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: MR2 Job over LZO data
>>
>>
>>
>> You can try to get the source code https://github.com/twitter/hadoop-lzo and then compile it against hadoop 2.2.0.
>>
>>
>>
>> In my memory, as long as rebuild it, lzo should work with hadoop 2.2.0
>>
>>
>>
>> On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:
>>
>> Running on Hadoop 2.2.0
>>
>>
>>
>> The Java MR2 job works as expected on an uncompressed data source using
>> the TextInputFormat.class.
>>
>> But when using the LZO format the job fails:
>>
>> import com.hadoop.mapreduce.LzoTextInputFormat;
>>
>> job.setInputFormatClass(LzoTextInputFormat.class);
>>
>>
>>
>> Dependencies from the maven repository:
>>
>> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
>>
>> Also tried with elephant-bird-core 4.4
>>
>>
>>
>> The same data can be queried fine from within Hive(0.12) on the same
>> cluster.
>>
>>
>>
>>
>>
>> The exception:
>>
>> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
>> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>>
>> at
>> com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
>>
>> at
>> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>>
>> at
>> com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
>>
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>>
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
>>
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>>
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
>>
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>>
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>>
>> at com.cloudreach.DataQuality.Main.main(Main.java:42)
>>
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>
>> at java.lang.reflect.Method.invoke(Method.java:606)
>>
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>
>>
>>
>> I believe the issue is related to the changes in Hadoop 2, but where can
>> I find a H2 compatible version?
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>> --
>>
>> Regards
>>
>> Gordon Wang
>>
>
>
>
> --
> Regards,
> Viswa.J
>



-- 
Regards
Gordon Wang

Re: MR2 Job over LZO data

Posted by Viswanathan J <ja...@gmail.com>.
Hi,

Getting the below error while running a Pig job on hadoop-2.x:

Caused by: java.io.IOException: No codec for file found
at
com.twitter.elephantbird.mapreduce.input.MultiInputFormat.determineFileFormat(MultiInputFormat.java:176)
at
com.twitter.elephantbird.mapreduce.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:88)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:256)

I have copied the respective LZO jars to the lib folders, but I am still
facing this issue.

Please help.



On Fri, Mar 7, 2014 at 7:53 PM, German Florez-Larrahondo <
german.fl@samsung.com> wrote:

> King
>
> Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0
>
>
>
> I hope this helps
>
>
>
> ./g
>
>
>
>
>
> *Where to get Hadoop LZO*
>
> https://github.com/twitter/hadoop-lzo
>
>
>
>
> http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html
>
>
>
> *Requirements*
>
> On CentOS:
>
> sudo yum install lzo*  --> /usr/lib64/liblzo2.so.2....
>
>
>
> On ubuntu:
>
> sudo apt-get install liblzo -->  on X86:  /usr/lib64/liblzo2.so.2
>
>
>
> *Clone:*
>
> git clone https://github.com/twitter/hadoop-lzo.git
>
>
>
> Follow instructions on README.md from this github site, basically
>
>
>
>  cd hadoop-lzo
>
> *     mvn clean package  test*
>
>
>
> *To enable this at run time do:*
>
> a.       Copy the library to the hadoop/share/common (if  you don't want
> to modify classpaths by putting the library somewhere else)
>
>
>
> cp lzo..././target/hadoop-lzo-0.4.20-SNAPSHOT.jar  ..
> hadoop/share/hadoop/common/
>
>
>
> a.       Copy /usr/lib64/liblzo2.so.2 to  .. Hadoop/lib/native/
>
>
>
>
>
> *From:* Gordon Wang [mailto:gwang@gopivotal.com]
> *Sent:* Thursday, March 06, 2014 11:50 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: MR2 Job over LZO data
>
>
>
> You can try to get the source code https://github.com/twitter/hadoop-lzo and then compile it against hadoop 2.2.0.
>
>
>
> In my memory, as long as rebuild it, lzo should work with hadoop 2.2.0
>
>
>
> On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:
>
> Running on Hadoop 2.2.0
>
>
>
> The Java MR2 job works as expected on an uncompressed data source using
> the TextInputFormat.class.
>
> But when using the LZO format the job fails:
>
> import com.hadoop.mapreduce.LzoTextInputFormat;
>
> job.setInputFormatClass(LzoTextInputFormat.class);
>
>
>
> Dependencies from the maven repository:
>
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
>
> Also tried with elephant-bird-core 4.4
>
>
>
> The same data can be queried fine from within Hive(0.12) on the same
> cluster.
>
>
>
>
>
> The exception:
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>
> at
> com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
>
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>
> at
> com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
>
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
>
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
>
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:415)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>
> at com.cloudreach.DataQuality.Main.main(Main.java:42)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:606)
>
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
>
>
> I believe the issue is related to the changes in Hadoop 2, but where can I
> find a H2 compatible version?
>
>
>
> Thanks
>
>
>
>
>
> --
>
> Regards
>
> Gordon Wang
>



-- 
Regards,
Viswa.J

RE: MR2 Job over LZO data

Posted by German Florez-Larrahondo <ge...@samsung.com>.
King

Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0

 

I hope this helps

 

./g

 

 

Where to get Hadoop LZO

https://github.com/twitter/hadoop-lzo

 

http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html

 

Requirements

On CentOS:

sudo yum install lzo*  --> /usr/lib64/liblzo2.so.2..

 

On Ubuntu:

sudo apt-get install liblzo2-dev -->  on x86:  /usr/lib64/liblzo2.so.2

 

Clone:

git clone https://github.com/twitter/hadoop-lzo.git

 

Follow the instructions in README.md on that GitHub site; basically:

 

 cd hadoop-lzo

     mvn clean package  test

 

To enable this at run time do:

a.       Copy the library to hadoop/share/hadoop/common (if you don't want to
modify classpaths by putting the library somewhere else)

 

cp lzo././target/hadoop-lzo-0.4.20-SNAPSHOT.jar  ..
hadoop/share/hadoop/common/

 

b.       Copy /usr/lib64/liblzo2.so.2 to  .. hadoop/lib/native/
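One step the list above leaves implicit: registering the LZO codec classes in core-site.xml, so that MapReduce, Pig, and Hive jobs can find a codec for .lzo files (a missing registration typically surfaces as a "No codec for file found" error). A sketch — the class names are the ones shipped in the hadoop-lzo jar, but verify them against the README of the version you build:

```xml
<!-- core-site.xml: register the LZO codecs shipped in the hadoop-lzo jar -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```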

 

 

From: Gordon Wang [mailto:gwang@gopivotal.com] 
Sent: Thursday, March 06, 2014 11:50 PM
To: user@hadoop.apache.org
Subject: Re: MR2 Job over LZO data

 

You can try to get the source code https://github.com/twitter/hadoop-lzo
and then compile it against hadoop 2.2.0.

 

In my memory, as long as rebuild it, lzo should work with hadoop 2.2.0

 

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:

Running on Hadoop 2.2.0

 

The Java MR2 job works as expected on an uncompressed data source using the
TextInputFormat.class.

But when using the LZO format the job fails:

import com.hadoop.mapreduce.LzoTextInputFormat;

job.setInputFormatClass(LzoTextInputFormat.class);

 

Dependencies from the maven repository:

http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/

Also tried with elephant-bird-core 4.4

 

The same data can be queried fine from within Hive(0.12) on the same
cluster.

 

 

The exception:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
interface org.apache.hadoop.mapreduce.JobContext, but class was expected

at
com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:6
2)

at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFor
mat.java:340)

at
com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:10
1)

at
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:49
1)

at
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)

at
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java
:392)

at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)

at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.ja
va:1491)

at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)

at com.cloudreach.DataQuality.Main.main(Main.java:42)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57
)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl
.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

I believe the issue is related to the changes in Hadoop 2, but where can I
find a H2 compatible version?

 

Thanks





 

-- 

Regards

Gordon Wang


Re: MR2 Job over LZO data

Posted by Gordon Wang <gw...@gopivotal.com>.
You can try getting the source code from
https://github.com/twitter/hadoop-lzo and compiling it against Hadoop 2.2.0.

As far as I remember, as long as you rebuild it, LZO should work with Hadoop 2.2.0.
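To make the rebuild-and-install flow concrete, here is a minimal sketch. It is written as a dry run that prints each command instead of executing it, so the paths, jar version, and the Maven property name (`hadoop.current.version` is an assumption — check the pom of your hadoop-lzo checkout) can be reviewed before running anything:

```shell
#!/bin/sh
# Dry-run sketch: prints each step instead of executing it.
# To actually run the commands, change 'echo "+ $*"' to '"$@"'.
run() { echo "+ $*"; }

HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"   # assumed install location

# 1. Get the source and rebuild it against the Hadoop 2 line
run git clone https://github.com/twitter/hadoop-lzo.git
run mvn -f hadoop-lzo/pom.xml clean package -Dhadoop.current.version=2.2.0

# 2. Put the rebuilt jar and the native LZO library where Hadoop looks
run cp hadoop-lzo/target/hadoop-lzo-0.4.20-SNAPSHOT.jar "$HADOOP_HOME/share/hadoop/common/"
run cp /usr/lib64/liblzo2.so.2 "$HADOOP_HOME/lib/native/"
```

Swapping `echo "+ $*"` for `"$@"` in `run` turns the dry run into the real installation.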


On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:

> Running on Hadoop 2.2.0
>
> The Java MR2 job works as expected on an uncompressed data source using
> the TextInputFormat.class.
> But when using the LZO format the job fails:
> import com.hadoop.mapreduce.LzoTextInputFormat;
> job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4
>
> The same data can be queried fine from within Hive(0.12) on the same
> cluster.
>
>
> The exception:
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>  at
> com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>  at
> com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>  at
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>  at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
>  at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>  at com.cloudreach.DataQuality.Main.main(Main.java:42)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the changes in Hadoop 2, but where can I
> find a H2 compatible version?
>
> Thanks
>



-- 
Regards
Gordon Wang

Re: MR2 Job over LZO data

Posted by Stanley Shi <ss...@gopivotal.com>.
Maybe you can try downloading the LZO source and rebuilding it against Hadoop
2.2.0. If the build succeeds, you should be good to go; if it fails, you may
need to wait for the LZO maintainers to update their code.

Regards,
*Stanley Shi,*



On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:

> Running on Hadoop 2.2.0
>
> The Java MR2 job works as expected on an uncompressed data source using
> the TextInputFormat.class.
> But when using the LZO format the job fails:
> import com.hadoop.mapreduce.LzoTextInputFormat;
> job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4
>
> The same data can be queried fine from within Hive(0.12) on the same
> cluster.
>
>
> The exception:
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
> at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
> at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
> at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
> at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
> at com.cloudreach.DataQuality.Main.main(Main.java:42)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the API changes in Hadoop 2, but where can
> I find a Hadoop 2-compatible version?
>
> Thanks
>

Re: MR2 Job over LZO data

Posted by Gordon Wang <gw...@gopivotal.com>.
You can get the source code from
https://github.com/twitter/hadoop-lzo and then compile it against
Hadoop 2.2.0.

As I recall, once it is rebuilt, LZO works with Hadoop 2.2.0.
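Once rebuilt, one way to make the jar visible to a Maven-built MR2 job is to install it into the local repository under the coordinates the job already depends on. This is a sketch; the jar path and version shown are assumptions matching the artifact mentioned in this thread.

```shell
# Install the locally rebuilt jar under the same Maven coordinates the
# job's pom.xml already references (path and version are assumptions).
mvn install:install-file \
  -Dfile=target/hadoop-lzo-0.4.19.jar \
  -DgroupId=com.hadoop.gplcompression \
  -DartifactId=hadoop-lzo \
  -Dversion=0.4.19 \
  -Dpackaging=jar
```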


On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <ki...@gmail.com> wrote:

> Running on Hadoop 2.2.0
>
> The Java MR2 job works as expected on an uncompressed data source using
> the TextInputFormat.class.
> But when using the LZO format the job fails:
> import com.hadoop.mapreduce.LzoTextInputFormat;
> job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4
>
> The same data can be queried fine from within Hive(0.12) on the same
> cluster.
>
>
> The exception:
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
> at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
> at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
> at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
> at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
> at com.cloudreach.DataQuality.Main.main(Main.java:42)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the API changes in Hadoop 2, but where can
> I find a Hadoop 2-compatible version?
>
> Thanks
>



-- 
Regards
Gordon Wang
