Posted to user@pig.apache.org by abhishek dodda <ab...@gmail.com> on 2014/05/20 08:46:31 UTC
Reading sequence file in pig
Hi All,
I am having trouble building the code for this project:
https://github.com/kevinweil/elephant-bird
Can someone tell me how to read sequence files in Pig?
--
Thanks,
Abhishek
Re: Reading sequence file in pig
Posted by abhishek dodda <ab...@gmail.com>.
This file is the output of the org.apache.hcatalog.pig.HCatStorer function.
On Tue, May 20, 2014 at 10:44 AM, abhishek dodda
<ab...@gmail.com> wrote:
> I am getting this error:
>
> A = load '/a/part-m-0000' using org.apache.pig.piggybank.storage.SequenceFileLoader();
>
> org.apache.pig.backend.BackendException: ERROR 0: Unable to translate
> class org.apache.hadoop.io.NullWritable to a Pig datatype
>
>
> at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:81)
> at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:138)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
> at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
--
Thanks,
Abhishek
2018509769
Re: Reading sequence file in pig
Posted by abhishek dodda <ab...@gmail.com>.
Hi Pradeep,
Thank you for all the help. The following works:
REGISTER /home/adodda/elephant-bird-pig-4.5.jar;
REGISTER /home/adodda/elephant-bird-pig-4.5-sources.jar;
REGISTER /home/xyz/elephant-bird-core-4.5-sources.jar;
REGISTER /home/xyz/elephant-bird-core-4.5.jar;
REGISTER /home/xyz/elephant-bird-hadoop-compat-4.5.jar;
A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
    '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
    '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key, value:chararray);
Thanks
Abhishek
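A note on the jar list in this message, offered as a sketch rather than a verified recipe. The earlier NoClassDefFoundError names com.twitter.elephantbird.mapreduce.input.RawSequenceFileInputFormat, and the working script differs from the failing ones by also registering elephant-bird-core and elephant-bird-hadoop-compat, which suggests the missing class is provided by one of them. The -sources jars conventionally contain .java files rather than compiled classes, so the version below assumes they can be dropped. The /home/xyz paths are the placeholders used in this thread, and the FOREACH/DUMP lines are an illustrative addition, not part of the original script.

```pig
-- Assumption: only the binary jars are needed; -sources jars hold .java files, not classes.
REGISTER /home/xyz/elephant-bird-core-4.5.jar;
REGISTER /home/xyz/elephant-bird-hadoop-compat-4.5.jar;
REGISTER /home/xyz/elephant-bird-pig-4.5.jar;

-- First converter handles the key (NullWritable here), second handles the value (Text).
A = LOAD '/etl/table=04' USING com.twitter.elephantbird.pig.load.SequenceFileLoader(
        '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
        '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key, value:chararray);

-- The key carries no data, so keep only the value column (illustrative).
B = FOREACH A GENERATE value;
DUMP B;
```

If the load still fails with a NoClassDefFoundError, the class named in the error is a reasonable hint for which additional jar to register.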
On Wed, May 21, 2014 at 12:14 PM, abhishek dodda
<ab...@gmail.com> wrote:
> Not working yet
>
> A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
>     '-c com.twitter.elephantbird.pig.util.TextConverter',
>     '-c com.twitter.elephantbird.pig.util.TextConverter')
>     AS (key,value:chararray);
>
> ERROR 2998: Unhandled internal error.
> com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
> java.lang.NoClassDefFoundError:
> com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
>
> A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
>     '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
>     '-c com.twitter.elephantbird.pig.util.TextConverter')
>     AS (key,value:chararray);
>
> ERROR 2998: Unhandled internal error.
> com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
> java.lang.NoClassDefFoundError:
> com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
>
> Should I register more jars?
>
> REGISTER /home/xyz/elephant-bird-pig-4.5.jar;
> REGISTER /home/xyz/elephant-bird-pig-4.5-sources.jar;
> REGISTER /home/xyz/elephant-bird-pig-4.5-tests.jar;
>
--
Thanks,
Abhishek
2018509769
Re: Reading sequence file in pig
Posted by abhishek dodda <ab...@gmail.com>.
Not working yet
A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
    '-c com.twitter.elephantbird.pig.util.TextConverter',
    '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key,value:chararray);
ERROR 2998: Unhandled internal error.
com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
java.lang.NoClassDefFoundError:
com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
    '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
    '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key,value:chararray);
ERROR 2998: Unhandled internal error.
com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
java.lang.NoClassDefFoundError:
com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
Should I register more jars?
REGISTER /home/xyz/elephant-bird-pig-4.5.jar;
REGISTER /home/xyz/elephant-bird-pig-4.5-sources.jar;
REGISTER /home/xyz/elephant-bird-pig-4.5-tests.jar;
On Wed, May 21, 2014 at 11:54 AM, Pradeep Gollakota <pr...@gmail.com> wrote:
> That is because null is not a data type in Pig.
> http://pig.apache.org/docs/r0.12.1/basic.html#data-types
>
> In fact, you don't need to specify a type at all for aliases.
>
> Try (key, value: chararray).
--
Thanks,
Abhishek
2018509769
Re: Reading sequence file in pig
Posted by Pradeep Gollakota <pr...@gmail.com>.
That is because null is not a data type in Pig.
http://pig.apache.org/docs/r0.12.1/basic.html#data-types
In fact, you don't need to specify a type at all for aliases.
Try (key, value: chararray).
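To illustrate the point about untyped aliases, here is a minimal sketch. The input path 'input.txt' and the use of the default loader are assumptions made only for illustration; a field declared without a type is treated as bytearray, which is Pig's default type.

```pig
-- Hypothetical tab-separated input file; 'key' is deliberately left untyped.
A = LOAD 'input.txt' AS (key, value:chararray);

-- DESCRIBE prints the schema of A, so the type assigned to 'key' can be inspected.
DESCRIBE A;
```

Writing key:null instead is rejected by the parser, because null is a value in Pig and not a type name.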
On Wed, May 21, 2014 at 2:21 PM, abhishek dodda
<ab...@gmail.com> wrote:
> Hi,
>
> REGISTER /home/xyz/elephant-bird-pig-4.5.jar;
> REGISTER /home/xyz/elephant-bird-pig-4.5-sources.jar;
> REGISTER /home/xyz/elephant-bird-pig-4.5-tests.jar;
>
>
> A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
>     '-c com.twitter.elephantbird.pig.util.TextConverter',
>     '-c com.twitter.elephantbird.pig.util.TextConverter')
>     AS (key:chararray,value:chararray);
>
>
> 2014-05-21 18:10:53,391 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 2998: Unhandled internal error.
> com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
> Details at logfile: /home/xyz/pig_1400694772994.log
>
> A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
>     '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
>     '-c com.twitter.elephantbird.pig.util.TextConverter')
>     AS (key:null,value:chararray);
>
> Also tried NullWritable as key
>
> 2014-05-21 18:11:58,554 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1200: <line 11, column 9> Syntax error, unexpected symbol at or near
> 'null'
> Details at logfile: /home/xyz/pig_1400694772994.log
>
> None of them worked. Am I missing something here?
Re: Reading sequence file in pig
Posted by abhishek dodda <ab...@gmail.com>.
Hi,
REGISTER /home/xyz/elephant-bird-pig-4.5.jar;
REGISTER /home/xyz/elephant-bird-pig-4.5-sources.jar;
REGISTER /home/xyz/elephant-bird-pig-4.5-tests.jar;
A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
    '-c com.twitter.elephantbird.pig.util.TextConverter',
    '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key:chararray,value:chararray);
2014-05-21 18:10:53,391 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2998: Unhandled internal error.
com/twitter/elephantbird/mapreduce/input/RawSequenceFileInputFormat
Details at logfile: /home/xyz/pig_1400694772994.log
A = load '/etl/table=04' using com.twitter.elephantbird.pig.load.SequenceFileLoader(
    '-c com.twitter.elephantbird.pig.util.NullWritableConverter',
    '-c com.twitter.elephantbird.pig.util.TextConverter')
    AS (key:null,value:chararray);
Also tried NullWritable as key
2014-05-21 18:11:58,554 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1200: <line 11, column 9> Syntax error, unexpected symbol at or near
'null'
Details at logfile: /home/xyz/pig_1400694772994.log
None of them worked. Am I missing something here?
On Tue, May 20, 2014 at 9:12 PM, Pradeep Gollakota <pr...@gmail.com> wrote:
> Sorry,
>
> Missed the part about loading custom types from SequenceFiles. The
> LoadFunc from piggybank will only load pig types. However, (as you already
> know), you can use elephant-bird. Not sure why you need to build it. The
> artifact exists in maven central.
>
>
> http://search.maven.org/#artifactdetails%7Ccom.twitter.elephantbird%7Celephant-bird-pig%7C4.5%7Cjar
>
> Hope this helps.
--
Thanks,
Abhishek
2018509769
Re: Reading sequence file in pig
Posted by Pradeep Gollakota <pr...@gmail.com>.
Sorry,
I missed the part about loading custom types from SequenceFiles. The LoadFunc
from piggybank will only load Pig types. However, as you already know,
you can use elephant-bird. I'm not sure why you need to build it; the artifact
is available in Maven Central.
http://search.maven.org/#artifactdetails%7Ccom.twitter.elephantbird%7Celephant-bird-pig%7C4.5%7Cjar
Hope this helps.
On Tue, May 20, 2014 at 1:44 PM, abhishek dodda
<ab...@gmail.com> wrote:
> I am getting this error:
>
> A = load '/a/part-m-0000' using org.apache.pig.piggybank.storage.SequenceFileLoader();
>
> org.apache.pig.backend.BackendException: ERROR 0: Unable to translate
> class org.apache.hadoop.io.NullWritable to a Pig datatype
>
>
> at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:81)
> at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:138)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
> at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
Re: Reading sequence file in pig
Posted by abhishek dodda <ab...@gmail.com>.
I am getting this error:
A = load '/a/part-m-0000' using org.apache.pig.piggybank.storage.SequenceFileLoader();
org.apache.pig.backend.BackendException: ERROR 0: Unable to translate class org.apache.hadoop.io.NullWritable to a Pig datatype
at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:81)
at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:138)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
On Tue, May 20, 2014 at 5:41 AM, Pradeep Gollakota <pr...@gmail.com> wrote:
> You can use the SequenceFileLoader from the piggybank.
>
>
> http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/storage/SequenceFileLoader.html
--
Thanks,
Abhishek
2018509769
Re: Reading sequence file in pig
Posted by Pradeep Gollakota <pr...@gmail.com>.
You can use the SequenceFileLoader from the piggybank.
http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/storage/SequenceFileLoader.html
On Tue, May 20, 2014 at 2:46 AM, abhishek dodda
<ab...@gmail.com> wrote:
> Hi All,
>
> I am having trouble building the code for this project:
>
> https://github.com/kevinweil/elephant-bird
>
> Can someone tell me how to read sequence files in Pig?
>
> --
> Thanks,
> Abhishek
>