Posted to user@hadoop.apache.org by Mark Kerzner <ma...@shmsoft.com> on 2013/03/07 03:46:27 UTC

Best practices for adding services to Hadoop cluster?

Hi,

my Hadoop cluster needs help: some tasks have to be done by a Windows server
with specialized closed-source software. How do I add them to the mix? For
example, I can run Tomcat, and the mapper would be calling a servlet there.
Is there anything better, which would be closer to the fault-tolerant
architecture of Hadoop itself?

Thank you,
Mark

Re: Best practices for adding services to Hadoop cluster?

Posted by Mark Kerzner <ma...@shmsoft.com>.
True,

and I am building a distributed scalable one -- in other words, back to the
same problem, but on this end. But what can one do?

Cheers,
Mark

On Wed, Mar 6, 2013 at 11:59 PM, Harsh J <ha...@cloudera.com> wrote:

> The only thing wrong would be what is said for the DB-talking jobs as
> well: Distributed mappers talking to a single point of service can
> bring it down.
>
> On Thu, Mar 7, 2013 at 10:59 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
> > Okay,
> >
> > then there is nothing wrong with the mapper directly talking to the server,
> > and failing the map task if the service does not work out.
> >
> > Thank you,
> > Mark
> >
> >
> > On Wed, Mar 6, 2013 at 11:21 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Can the mapper not directly talk to whatever application server the
> >> Windows server runs? Is the work needed to be done in the map step
> >> (i.e. per record)? If not, you can perhaps also consider the SSH
> >> action of Oozie (although I've never tried it with a Windows machine)
> >> under a workflow.
> >>
> >> On Thu, Mar 7, 2013 at 8:16 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
> >> > Hi,
> >> >
> >> > my Hadoop cluster needs help: some tasks have to be done by a Windows server
> >> > with specialized closed-source software. How do I add them to the mix? For
> >> > example, I can run Tomcat, and the mapper would be calling a servlet there.
> >> > Is there anything better, which would be closer to the fault-tolerant
> >> > architecture of Hadoop itself?
> >> >
> >> > Thank you,
> >> > Mark
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: Best practices for adding services to Hadoop cluster?

Posted by Harsh J <ha...@cloudera.com>.
The only thing wrong would be what is said for the DB-talking jobs as
well: Distributed mappers talking to a single point of service can
bring it down.
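
One way to reduce the risk described above is for each map task to pace its own requests. The following is a minimal sketch in plain Java, assuming a fixed per-task rate cap is acceptable; `CallPacer`, `reserve`, and `pace` are hypothetical names for illustration, not a Hadoop API. It limits only a single task, so the aggregate load on the service is roughly this rate multiplied by the number of map tasks running at the same time.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical per-task request pacer; not part of Hadoop. */
final class CallPacer {
    private final long minIntervalNanos; // minimum spacing between two calls
    private long nextAllowedNanos;       // earliest time the next call may start
    private boolean started;             // false until the first reservation

    CallPacer(double maxCallsPerSecond) {
        if (maxCallsPerSecond <= 0) {
            throw new IllegalArgumentException("maxCallsPerSecond must be positive");
        }
        this.minIntervalNanos = (long) (1_000_000_000L / maxCallsPerSecond);
    }

    /**
     * Reserves the next call slot and returns how long the caller should
     * wait (in nanoseconds) before making the call. Takes the current time
     * as a parameter so the logic can be checked without sleeping.
     */
    long reserve(long nowNanos) {
        if (!started || nextAllowedNanos - nowNanos <= 0) {
            // First call, or the previous slot has already passed: go now.
            started = true;
            nextAllowedNanos = nowNanos + minIntervalNanos;
            return 0L;
        }
        long wait = nextAllowedNanos - nowNanos;
        nextAllowedNanos += minIntervalNanos;
        return wait;
    }

    /** Blocks until the next call is allowed; call this before each request. */
    void pace() throws InterruptedException {
        long wait = reserve(System.nanoTime());
        if (wait > 0) {
            TimeUnit.NANOSECONDS.sleep(wait);
        }
    }
}
```

A mapper would create one pacer per task and call `pace()` before each request to the external service. Limiting how many map tasks run concurrently is a separate knob that this sketch does not cover.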

On Thu, Mar 7, 2013 at 10:59 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
> Okay,
>
> then there is nothing wrong with the mapper directly talking to the server,
> and failing the map task if the service does not work out.
>
> Thank you,
> Mark
>
>
> On Wed, Mar 6, 2013 at 11:21 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Can the mapper not directly talk to whatever application server the
>> Windows server runs? Is the work needed to be done in the map step
>> (i.e. per record)? If not, you can perhaps also consider the SSH
>> action of Oozie (although I've never tried it with a Windows machine)
>> under a workflow.
>>
>> On Thu, Mar 7, 2013 at 8:16 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
>> > Hi,
>> >
>> > my Hadoop cluster needs help: some tasks have to be done by a Windows server
>> > with specialized closed-source software. How do I add them to the mix? For
>> > example, I can run Tomcat, and the mapper would be calling a servlet there.
>> > Is there anything better, which would be closer to the fault-tolerant
>> > architecture of Hadoop itself?
>> >
>> > Thank you,
>> > Mark
>>
>>
>>
>> --
>> Harsh J
>
>



--
Harsh J

Re: Best practices for adding services to Hadoop cluster?

Posted by Mark Kerzner <ma...@shmsoft.com>.
Okay,

then there is nothing wrong with the mapper directly talking to the server,
and failing the map task if the service does not work out.

Thank you,
Mark
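
The approach described above (call the service directly and fail the map task when it does not respond) can be sketched in plain Java as follows. This is an illustration under stated assumptions, not code from the thread: `ExternalServiceCaller` and `callWithRetries` are hypothetical names, the request itself is passed in as a `Callable` so the sketch does not depend on any particular HTTP client, and the retry count and backoff values are placeholders.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for calling an external service; not part of Hadoop. */
final class ExternalServiceCaller {

    private ExternalServiceCaller() {}

    /**
     * Runs the request, retrying with exponential backoff on failure.
     * If every attempt fails, throws IOException so the caller can let
     * the failure propagate instead of silently dropping the record.
     */
    static <T> T callWithRetries(Callable<T> request, int maxAttempts, long initialBackoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (InterruptedException e) {
                throw e; // do not retry when the thread is being interrupted
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff); // wait before the next attempt
                    backoff *= 2;          // double the wait each time
                }
            }
        }
        throw new IOException("external service failed after " + maxAttempts + " attempts", last);
    }
}
```

Inside a mapper, letting this `IOException` escape from `map()` fails the task attempt, and the framework then reschedules the attempt up to its configured limit. The short local retries only smooth over brief outages; a service that stays down still fails the job.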

On Wed, Mar 6, 2013 at 11:21 PM, Harsh J <ha...@cloudera.com> wrote:

> Can the mapper not directly talk to whatever application server the
> Windows server runs? Is the work needed to be done in the map step
> (i.e. per record)? If not, you can perhaps also consider the SSH
> action of Oozie (although I've never tried it with a Windows machine)
> under a workflow.
>
> On Thu, Mar 7, 2013 at 8:16 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
> > Hi,
> >
> > my Hadoop cluster needs help: some tasks have to be done by a Windows server
> > with specialized closed-source software. How do I add them to the mix? For
> > example, I can run Tomcat, and the mapper would be calling a servlet there.
> > Is there anything better, which would be closer to the fault-tolerant
> > architecture of Hadoop itself?
> >
> > Thank you,
> > Mark
>
>
>
> --
> Harsh J
>

Re: Best practices for adding services to Hadoop cluster?

Posted by Harsh J <ha...@cloudera.com>.
Can the mapper not directly talk to whatever application server the
Windows server runs? Is the work needed to be done in the map step
(i.e. per record)? If not, you can perhaps also consider the SSH
action of Oozie (although I've never tried it with a Windows machine)
under a workflow.

On Thu, Mar 7, 2013 at 8:16 AM, Mark Kerzner <ma...@shmsoft.com> wrote:
> Hi,
>
> my Hadoop cluster needs help: some tasks have to be done by a Windows server
> with specialized closed-source software. How do I add them to the mix? For
> example, I can run Tomcat, and the mapper would be calling a servlet there.
> Is there anything better, which would be closer to the fault-tolerant
> architecture of Hadoop itself?
>
> Thank you,
> Mark



--
Harsh J
