Posted to dev@pig.apache.org by devdoer bird <de...@gmail.com> on 2012/01/06 11:01:50 UTC

Can I replace the Pig backend with Hadoop streaming?

Hi,

I want to implement a new Pig backend. Can I replace the Hadoop backend
with a Hadoop-streaming-only backend?

I have decided to use streaming to implement the backend.storage and
backend.executionengine interfaces, but I want to know whether this is the
right way to do it.

Thanks.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by devdoer bird <de...@gmail.com>.
Thanks.

2012/1/10 Dmitriy Ryaboy <dv...@gmail.com>

> Oh and as luck would have it, it was made for the 0.4-0.5 release.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Oh and as luck would have it, it was made for the 0.4-0.5 release.

On Mon, Jan 9, 2012 at 1:13 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> I had a patch that introduced shims, which included shims to make Pig work
> with 19. It's woefully out of date at this point, but you can at least use
> it as a starting point: https://issues.apache.org/jira/browse/PIG-924

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
I had a patch that introduced shims, which included shims to make Pig work
with 19. It's woefully out of date at this point, but you can at least use
it as a starting point: https://issues.apache.org/jira/browse/PIG-924
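
The shim idea can be sketched as follows. This is not the code from PIG-924
(Pig itself is written in Java); it is a minimal Python illustration of the
pattern, and every name in it (Hadoop19Shim, Hadoop20Shim, load_shim,
job_api) is hypothetical. The point is that version-specific calls live
behind one small interface, chosen once from the Hadoop version string, so
the rest of the code never touches a version-specific API directly.

```python
class Hadoop19Shim:
    """Hypothetical adapter holding everything specific to Hadoop 0.19."""

    def job_api(self):
        # Placeholder label; a real shim would wrap the actual 0.19 calls here.
        return "0.19-style API"


class Hadoop20Shim:
    """Hypothetical adapter holding everything specific to Hadoop 0.20."""

    def job_api(self):
        # Placeholder label; a real shim would wrap the actual 0.20 calls here.
        return "0.20-style API"


# Map "major.minor" version prefixes to the shim class that handles them.
SHIMS = {"0.19": Hadoop19Shim, "0.20": Hadoop20Shim}


def load_shim(hadoop_version):
    """Pick a shim from a version string such as '0.19.2'."""
    prefix = ".".join(hadoop_version.split(".")[:2])
    if prefix not in SHIMS:
        raise ValueError(f"no shim available for Hadoop {hadoop_version}")
    return SHIMS[prefix]()
```

Supporting a privately modified Hadoop would then, in principle, mean adding
one more shim class rather than a whole new backend.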

On Mon, Jan 9, 2012 at 11:23 AM, Daniel Dai <da...@hortonworks.com> wrote:

> There are some incompatible API changes between Hadoop 0.19 and 0.20. You
> will need to change the current Pig code in order to compile and run
> against 0.19. But the cost of downgrading Hadoop should be much lower than
> that of writing a new backend. You can check PIG-660 (in which we upgraded
> from Hadoop 0.18 to 0.20) to get some idea.
>
> Daniel

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Daniel Dai <da...@hortonworks.com>.
There are some incompatible API changes between Hadoop 0.19 and 0.20. You
will need to change the current Pig code in order to compile and run
against 0.19. But the cost of downgrading Hadoop should be much lower than
that of writing a new backend. You can check PIG-660 (in which we upgraded
from Hadoop 0.18 to 0.20) to get some idea.

Daniel

On Mon, Jan 9, 2012 at 2:01 AM, devdoer bird <de...@gmail.com> wrote:

> If possible I want to use the official Pig distribution, but my company's
> Hadoop (derived from Hadoop 0.19) has changed a lot of APIs relative to
> the official one, and I can't get documentation for the changes.
>
> I tried to fix the incompatibility by first making Pig 0.9.1 work with
> official Hadoop 0.19 and then with my company's private Hadoop
> distribution, but I don't know where to begin.
>
> Any help would be welcome.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by devdoer bird <de...@gmail.com>.
If possible I want to use the official Pig distribution, but my company's
Hadoop (derived from Hadoop 0.19) has changed a lot of APIs relative to
the official one, and I can't get documentation for the changes.

I tried to fix the incompatibility by first making Pig 0.9.1 work with
official Hadoop 0.19 and then with my company's private Hadoop
distribution, but I don't know where to begin.

Any help would be welcome.

2012/1/9 Daniel Dai <da...@hortonworks.com>

> What error did you see when you used the official Pig distribution?
> Writing a backend is non-trivial even if it is possible (especially for
> 0.5, where it would be hard to get help from the community).

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Daniel Dai <da...@hortonworks.com>.
What error did you see when you used the official Pig distribution?
Writing a backend is non-trivial even if it is possible (especially for
0.5, where it would be hard to get help from the community).

On Mon, Jan 9, 2012 at 12:16 AM, devdoer bird <de...@gmail.com> wrote:

> So I think a streaming backend may be a good solution for this situation.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by devdoer bird <de...@gmail.com>.
So I think a streaming backend may be a good solution for this situation.
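
For reference, a Hadoop streaming job runs ordinary executables as the
mapper and reducer: each reads text lines on stdin and writes key<TAB>value
lines to stdout, and the framework sorts the mapper output by key before
the reducer sees it. A streaming-only backend would presumably have to turn
each stage of a Pig script into programs of this shape. The sketch below is
only an illustration under that assumption, not anything Pig generates; the
function names map_lines and reduce_lines are hypothetical. It counts
records per key, roughly what a GROUP followed by COUNT would need.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'key<TAB>1' using the first tab-separated field as key."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank input lines
        key = line.split("\t", 1)[0]
        yield f"{key}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"
```

On a cluster, thin wrapper scripts would feed sys.stdin to each function and
print the results, and those scripts would be handed to the streaming jar
through its -mapper and -reducer options.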

2012/1/9 devdoer bird <de...@gmail.com>

> The reason I decided to use streaming as the backend is that my company's
> Hadoop has been modified, so it might not be compatible with the official
> Hadoop distribution, and I can't make Pig run on our private Hadoop
> distribution.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by devdoer bird <de...@gmail.com>.
The reason I decided to use streaming as the backend is that my company's
Hadoop has been modified, so it might not be compatible with the official
Hadoop distribution, and I can't make Pig run on our private Hadoop
distribution.

2012/1/9 Daniel Dai <da...@hortonworks.com>

> Pig does have an abstraction layer in the execution engine, but it is
> mostly a legacy of early versions. In recent development we have not kept
> platform neutrality in mind, so I don't know how reliable this interface
> is. Can you elaborate on your idea so we can find a better solution?
>
> Daniel

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Daniel Dai <da...@hortonworks.com>.
In 0.5 we do have two backends, Hadoop and local, so in theory it is
possible to write a different backend. But you need to be aware that 0.5
has been out of support for a long time and is missing a lot of features.
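
The kind of abstraction layer being discussed can be pictured as one
interface with several interchangeable engines behind it. The sketch below
is an assumption-laden illustration in Python, not Pig's actual
backend.executionengine API: ExecutionEngine, LocalEngine, TextPipeEngine
and keep_adults are all hypothetical names. It also hints at one cost of a
streaming-style engine: records cross every step boundary as text, so typed
fields come back as strings.

```python
from abc import ABC, abstractmethod


class ExecutionEngine(ABC):
    """Hypothetical backend contract: run a list of steps over some records."""

    @abstractmethod
    def execute(self, steps, records):
        """Apply each step (a function from iterable to iterable) in order."""


class LocalEngine(ExecutionEngine):
    """Runs every step in-process, passing Python objects straight through."""

    def execute(self, steps, records):
        for step in steps:
            records = step(records)
        return list(records)


class TextPipeEngine(ExecutionEngine):
    """Mimics a streaming-style backend: records are serialized to
    tab-separated text before each step, so fields arrive as strings."""

    def execute(self, steps, records):
        for step in steps:
            lines = ["\t".join(str(field) for field in rec) for rec in records]
            records = step(tuple(line.split("\t")) for line in lines)
        return list(records)


def keep_adults(records):
    """Example step: keep (name, age) records whose age is at least 18."""
    return [(name, age) for name, age in records if int(age) >= 18]
```

Running the same step list through both engines, for example
LocalEngine().execute([keep_adults], data) and
TextPipeEngine().execute([keep_adults], data), keeps the same rows but
returns the age as an integer in the first case and as a string in the
second.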

Daniel

On Mon, Jan 9, 2012 at 12:03 AM, devdoer bird <de...@gmail.com> wrote:

> Thanks.
>
> Which version keeps this legacy? I'm trying to add this feature to Pig 0.5.

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by devdoer bird <de...@gmail.com>.
Thanks.

Which version keeps this legacy? I'm trying to add this feature to Pig 0.5.

2012/1/9 Daniel Dai <da...@hortonworks.com>

> Pig does have an abstraction layer in the execution engine, but it is
> mostly a legacy of early versions. In recent development we have not kept
> platform neutrality in mind, so I don't know how reliable this interface
> is. Can you elaborate on your idea so we can find a better solution?
>
> Daniel

Re: Can I replace the Pig backend with Hadoop streaming?

Posted by Daniel Dai <da...@hortonworks.com>.
Pig does have an abstraction layer in the execution engine, but it is
mostly a legacy of early versions. In recent development we have not kept
platform neutrality in mind, so I don't know how reliable this interface
is. Can you elaborate on your idea so we can find a better solution?

Daniel

On Sun, Jan 8, 2012 at 7:50 PM, devdoer bird <de...@gmail.com> wrote:

> Hi,
>
> I want to implement a new Pig backend. Can I replace the Hadoop backend
> with a Hadoop-streaming-only backend?
>
> I have decided to use streaming to implement the backend.storage and
> backend.executionengine interfaces, but I want to know whether this is the
> right way to do it.
>
> Thanks.
>
