Posted to user@pig.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/09/11 20:22:10 UTC

Reading BytesWritable in sequence file

Is there a way to read BytesWritable using the sequence file loader from
piggybank? If not, how should I go about implementing one?

Re: Reading BytesWritable in sequence file

Posted by Adam Kawa <ka...@gmail.com>.
What I did was:

1) extend the FileInputLoadFunc class
2) add a new type (BYTES_WRITABLE) to the switch-case in the methods
inferPigDataType(Type t) and translateWritableToPigDataType(Writable
w, byte dataType)
3) use it as the loader

It is probably not the best way to solve this issue.
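
The two method names in step 2 appear to match those in piggybank's SequenceFileLoader, so the steps above can be read as "copy that loader and teach it one more type". Below is a minimal, dependency-free sketch of the two changes, not the actual loader code. The PigType enum, the class name, and both method names are hypothetical stand-ins: a real loader would compare against Hadoop's Writable classes and return a Pig DataByteArray. The one detail taken from the Hadoop API is that BytesWritable.getBytes() returns the backing buffer, which can be longer than getLength(), so only the valid prefix should be copied.

```java
import java.util.Arrays;

// Dependency-free model of step 2; the real loader would work with Hadoop's
// BytesWritable and Pig's DataByteArray instead of these stand-ins.
final class BytesWritableSketch {

    // Hypothetical stand-in for the Pig type codes the loader switches on.
    enum PigType { CHARARRAY, LONG, BYTEARRAY, UNSUPPORTED }

    // Models the type-inference method: map a Writable class name to a
    // Pig type, with the new BytesWritable case added.
    static PigType inferPigType(String writableClassName) {
        switch (writableClassName) {
            case "org.apache.hadoop.io.Text":
                return PigType.CHARARRAY;
            case "org.apache.hadoop.io.LongWritable":
                return PigType.LONG;
            case "org.apache.hadoop.io.BytesWritable": // the added case
                return PigType.BYTEARRAY;
            default:
                return PigType.UNSUPPORTED;
        }
    }

    // Models the value-translation method: copy only the first validLength
    // bytes, because the backing buffer may carry unused capacity at the end.
    static byte[] validBytes(byte[] backingBuffer, int validLength) {
        return Arrays.copyOf(backingBuffer, validLength);
    }

    public static void main(String[] args) {
        byte[] padded = {1, 2, 3, 0, 0}; // 3 valid bytes, 2 bytes of slack
        System.out.println(inferPigType("org.apache.hadoop.io.BytesWritable"));
        System.out.println(Arrays.toString(validBytes(padded, 3)));
    }
}
```

In a real loader the copied bytes would presumably be wrapped in a DataByteArray before being added to the tuple.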

2012/9/15 Aniket Mokashi <an...@gmail.com>:
> For a simpler use case, something similar to the following should work:
>
> public class PigSequenceFileLoader extends PigStorage {
>
>     @SuppressWarnings("rawtypes")
>     @Override
>     public InputFormat getInputFormat() {
>         return new SequenceFileInputFormat<BytesWritable, Text>();
>     }
> }
>
> Thanks,
>
> Aniket

Re: Reading BytesWritable in sequence file

Posted by Aniket Mokashi <an...@gmail.com>.
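For a simpler use case, something similar to the following should work:

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.pig.builtin.PigStorage;

public class PigSequenceFileLoader extends PigStorage {

    // Swap PigStorage's text input format for a sequence file one;
    // the record values must be Text for PigStorage to parse them.
    @SuppressWarnings("rawtypes")
    @Override
    public InputFormat getInputFormat() {
        return new SequenceFileInputFormat<BytesWritable, Text>();
    }
}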

Thanks,

Aniket
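
A note on what this shortcut gives you: as far as I can tell, PigStorage never looks at the record key and simply splits each Text value on its delimiter (tab by default), so the BytesWritable keys are skipped rather than loaded. The sketch below models that splitting in plain Java so it runs without Pig or Hadoop. The class and method names are hypothetical, and it is only an approximation of what PigStorage does internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Rough, dependency-free model of how a delimited value becomes tuple fields.
final class DelimitedValueSketch {

    // Split the first len bytes of buf on the delimiter byte.
    // A trailing delimiter yields an empty last field.
    static List<byte[]> splitFields(byte[] buf, int len, byte delimiter) {
        List<byte[]> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < len; i++) {
            if (buf[i] == delimiter) {
                fields.add(Arrays.copyOfRange(buf, start, i));
                start = i + 1;
            }
        }
        fields.add(Arrays.copyOfRange(buf, start, len)); // pick up the last field
        return fields;
    }

    public static void main(String[] args) {
        byte[] value = "id42\tsome payload\t2012".getBytes(StandardCharsets.UTF_8);
        for (byte[] field : splitFields(value, value.length, (byte) '\t')) {
            System.out.println(new String(field, StandardCharsets.UTF_8));
        }
    }
}
```

If the BytesWritable data itself is needed in Pig, a loader that translates the Writable types is the safer route.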

On Thu, Sep 13, 2012 at 1:24 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Install protocol buffers 2.3 and thrift 0.5
>
> From the readme:
>
> Protocol Buffer and Thrift compiler dependencies
> Elephant Bird requires Protocol Buffer compiler version 2.3 at build
> time, as generated classes are used internally. Thrift compiler is
> required to generate classes used in tests. As these are native-code
> tools they must be installed on the build machine (java library
> dependencies are pulled from maven repositories during the build).
>
>
>
> D



-- 
"...:::Aniket:::... Quetzalco@tl"

Re: Reading BytesWritable in sequence file

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Install protocol buffers 2.3 and thrift 0.5

From the readme:

Protocol Buffer and Thrift compiler dependencies
Elephant Bird requires Protocol Buffer compiler version 2.3 at build
time, as generated classes are used internally. Thrift compiler is
required to generate classes used in tests. As these are native-code
tools they must be installed on the build machine (java library
dependencies are pulled from maven repositories during the build).



D
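
Since the build fails when the native compilers are missing, it can help to check for them before running mvn package. The following is a small sketch of such a check, not part of Elephant Bird: the class and method names are hypothetical. It only looks for files named protoc and thrift in the PATH directories; it does not verify versions (protoc --version would report that), and on Windows the binaries would carry an .exe suffix that this sketch ignores.

```java
import java.io.File;
import java.util.Optional;

// Looks for native build tools on the PATH before starting the Maven build.
final class ToolLookup {

    // Return the first executable file named toolName found in pathList,
    // where pathList uses the platform separator (":" on Unix, ";" on Windows).
    static Optional<File> findOnPath(String toolName, String pathList) {
        if (pathList == null || pathList.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, toolName);
            if (candidate.isFile() && candidate.canExecute()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        for (String tool : new String[] {"protoc", "thrift"}) {
            Optional<File> hit = findOnPath(tool, path);
            System.out.println(tool + ": "
                    + hit.map(File::getAbsolutePath).orElse("not found on PATH"));
        }
    }
}
```

If either tool is reported as not found, installing it and reopening the shell is the first thing to try.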

On Wed, Sep 12, 2012 at 9:01 PM, Mohit Anchlia <mo...@gmail.com> wrote:
> I got this error when I ran mvn package
>
> [ERROR] Failed to execute goal
> com.github.igor-petruk.protobuf:protobuf-maven-plugin:0.4:run (default)
> on project elephant-bird-core: Unable to find 'protoc' -> [Help 1]
> [ERROR]

Re: Reading BytesWritable in sequence file

Posted by Mohit Anchlia <mo...@gmail.com>.
I got this error when I ran mvn package

[ERROR] Failed to execute goal
com.github.igor-petruk.protobuf:protobuf-maven-plugin:0.4:run (default)
on project elephant-bird-core: Unable to find 'protoc' -> [Help 1]
[ERROR]

On Tue, Sep 11, 2012 at 4:24 PM, Mohit Anchlia <mo...@gmail.com> wrote:

> Thanks! I'll try it out.

Re: Reading BytesWritable in sequence file

Posted by Mohit Anchlia <mo...@gmail.com>.
Thanks! I'll try it out.

On Tue, Sep 11, 2012 at 4:21 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Yup:
> https://github.com/kevinweil/elephant-bird
>
> D

Re: Reading BytesWritable in sequence file

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Yup:
https://github.com/kevinweil/elephant-bird

D

On Tue, Sep 11, 2012 at 4:00 PM, Mohit Anchlia <mo...@gmail.com> wrote:
> Is it the code that I check out and build?

Re: Reading BytesWritable in sequence file

Posted by Mohit Anchlia <mo...@gmail.com>.
Is it the code that I check out and build?

On Tue, Sep 11, 2012 at 3:27 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Try the one in Elephant-Bird.

Re: Reading BytesWritable in sequence file

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Try the one in Elephant-Bird.

On Tue, Sep 11, 2012 at 11:22 AM, Mohit Anchlia <mo...@gmail.com> wrote:
> Is there a way to read BytesWritable using sequence file loader from
> piggybank? If not then how should I go about implementing one?