Posted to common-user@hadoop.apache.org by Cagdas Gerede <ca...@gmail.com> on 2008/04/15 20:21:28 UTC

multiple datanodes in the same machine

Is there a way to run multiple datanodes in the same machine?


-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info

Re: multiple datanodes in the same machine

Posted by Ted Dunning <td...@veoh.com>.
And the two instances will interfere with each other significantly, so their
work will tend to serialize.


On 4/15/08 3:24 PM, "Theodore Van Rooy" <mu...@gmail.com> wrote:

> " Why do you want to do this perverse thing?"
> 
> Agreed.
> 
> It sounds like even your testing won't really exercise what you want to
> test.  When you have two installations on the same machine, the network
> latency and other issues that occur when transferring data between nodes
> likely won't be tested.


Re: multiple datanodes in the same machine

Posted by Theodore Van Rooy <mu...@gmail.com>.
" Why do you want to do this perverse thing?"

Agreed.

It sounds like even your testing won't really exercise what you want to
test.  When you have two installations on the same machine, the network
latency and other issues that occur when transferring data between nodes
likely won't be tested.


On Tue, Apr 15, 2008 at 4:12 PM, Cagdas Gerede <ca...@gmail.com>
wrote:

> I am working on the distributed file system part. I do not use the MR part,
> and I need to run multiple processes to test some scenarios on the file
> system.



-- 
Theodore Van Rooy
http://greentheo.scroggles.com

Re: multiple datanodes in the same machine

Posted by Ted Dunning <td...@veoh.com>.
Then the answer is yes.  You need 10 configuration directories.
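
One way to avoid creating the 10 configuration directories by hand is to generate them. The following is only a sketch under stated assumptions: the file name hadoop-site.xml, the property names, and the base ports are taken from commonly documented Hadoop settings and may differ in your release, and the helper names are hypothetical. A real configuration directory also needs the rest of the usual files (namenode address, logging setup, and so on); this only writes the per-datanode overrides.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_conf_dir(conf_root, index, properties, filename="hadoop-site.xml"):
    """Create conf_root/dn<index>/<filename> holding the given properties.

    The file name is an assumption; newer releases use hdfs-site.xml.
    """
    conf_dir = Path(conf_root) / f"dn{index}"
    conf_dir.mkdir(parents=True, exist_ok=True)

    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value

    ET.ElementTree(root).write(
        str(conf_dir / filename), encoding="utf-8", xml_declaration=True
    )
    return conf_dir


def make_conf_dirs(conf_root, count=10):
    """Write one configuration directory per datanode and return the paths."""
    dirs = []
    for i in range(count):
        # Property names and base ports are assumptions; check the defaults
        # file shipped with your Hadoop version before relying on them.
        props = {
            "dfs.datanode.address": f"0.0.0.0:{50010 + 100 * i}",
            "dfs.datanode.http.address": f"0.0.0.0:{50075 + 100 * i}",
            "dfs.data.dir": f"{conf_root}/dn{i}/data",
        }
        dirs.append(write_conf_dir(conf_root, i, props))
    return dirs


if __name__ == "__main__":
    root = tempfile.mkdtemp(prefix="hdfs-conf-")
    for conf_dir in make_conf_dirs(root):
        print(conf_dir)
```

Each generated directory would then be handed to a separate datanode process, for example through the launcher script's --config option if your release supports it.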


On 4/15/08 3:12 PM, "Cagdas Gerede" <ca...@gmail.com> wrote:

> I am working on the distributed file system part. I do not use the MR part,
> and I need to run multiple processes to test some scenarios on the file
> system.


Re: multiple datanodes in the same machine

Posted by Cagdas Gerede <ca...@gmail.com>.
I am working on the distributed file system part. I do not use the MR part,
and I need to run multiple processes to test some scenarios on the file
system.

On Tue, Apr 15, 2008 at 1:37 PM, Ted Dunning <td...@veoh.com> wrote:

>
> I have had no issues in scaling the number of datanodes.  The location of
> the data is almost invisible to MR programs.
>
> I have had issues in going from local to distributed mode, but those have
> been entirely due to classpath-like issues.  Since MR naturally restricts
> your focus, it is pretty much the rule that programs scale without much
> thought.
>
> If you test with two tasktrackers and one datanode, you should have a
> pretty solid test environment.


-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info

Re: multiple datanodes in the same machine

Posted by Ted Dunning <td...@veoh.com>.
I have had no issues in scaling the number of datanodes.  The location of
the data is almost invisible to MR programs.

I have had issues in going from local to distributed mode, but those have
been entirely due to classpath-like issues.  Since MR naturally restricts
your focus, it is pretty much the rule that programs scale without much
thought.

If you test with two tasktrackers and one datanode, you should have a
pretty solid test environment.


On 4/15/08 1:12 PM, "cagdas.gerede@gmail.com" <ca...@gmail.com>
wrote:

> Testing when I do not have 10 machines.


Re: multiple datanodes in the same machine

Posted by ca...@gmail.com.
Testing when I do not have 10 machines.


On 4/15/08, Ted Dunning <td...@veoh.com> wrote:
>
> Why do you want to do this perverse thing?
>
> How does it help to have more than one datanode per machine?  And what in
> the world is better when you have 10?


-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info

Re: multiple datanodes in the same machine

Posted by Ted Dunning <td...@veoh.com>.
Why do you want to do this perverse thing?

How does it help to have more than one datanode per machine?  And what in
the world is better when you have 10?


On 4/15/08 12:53 PM, "Cagdas Gerede" <ca...@gmail.com> wrote:

> I have a follow-up question.
> Is there a way to programmatically configure datanode parameters and start
> the datanode process?
> If I want to create 10 datanodes on the same host, do I have to create 10
> config files?


Re: multiple datanodes in the same machine

Posted by Cagdas Gerede <ca...@gmail.com>.
I have a follow-up question.
Is there a way to programmatically configure datanode parameters and start
the datanode process?
If I want to create 10 datanodes on the same host, do I have to create 10
config files?


On Tue, Apr 15, 2008 at 12:29 PM, dhruba Borthakur <dh...@yahoo-inc.com>
wrote:

> Yes, just point the datanodes to different config files, different sets
> of ports, different data directories, etc.
>
> Thanks,
> dhruba



-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info

RE: multiple datanodes in the same machine

Posted by dhruba Borthakur <dh...@yahoo-inc.com>.
Yes, just point the datanodes to different config files, different sets
of ports, different data directories, etc.

Thanks,
dhruba
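
The "different sets of ports, different data directories" advice above can be sketched as a small helper that derives non-overlapping settings for each datanode on one host. This is an illustration, not part of the original reply: the property names and base ports are assumptions drawn from commonly documented Hadoop defaults (newer releases rename the data directory property and use different ports), and the function names and the /tmp/hdfs-test root are hypothetical.

```python
# Assumed base ports for one datanode; verify against your release's defaults.
BASE_PORTS = {
    "dfs.datanode.address": 50010,       # block data transfer
    "dfs.datanode.http.address": 50075,  # web interface
    "dfs.datanode.ipc.address": 50020,   # IPC server (absent in older releases)
}


def datanode_settings(index, root="/tmp/hdfs-test", stride=100):
    """Return property -> value for the index-th datanode on this host.

    Every port is shifted by index * stride so that no two datanodes
    try to bind the same port, and each gets its own data directory.
    """
    settings = {
        name: f"0.0.0.0:{port + index * stride}"
        for name, port in BASE_PORTS.items()
    }
    # "dfs.data.dir" is the older property name (an assumption here).
    settings["dfs.data.dir"] = f"{root}/dn{index}/data"
    return settings


def all_settings(count):
    """Build settings for `count` datanodes and reject any port collision."""
    nodes = [datanode_settings(i) for i in range(count)]
    ports = [
        value
        for node in nodes
        for name, value in node.items()
        if name != "dfs.data.dir"
    ]
    if len(ports) != len(set(ports)):
        raise ValueError("port collision between datanodes")
    return nodes


if __name__ == "__main__":
    for i, node in enumerate(all_settings(3)):
        print(i, node)
```

A stride of 100 keeps the three port families apart because the gaps between the assumed base ports are not multiples of 100.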

-----Original Message-----
From: Cagdas Gerede [mailto:cagdas.gerede@gmail.com] 
Sent: Tuesday, April 15, 2008 11:21 AM
To: core-user@hadoop.apache.org
Subject: multiple datanodes in the same machine

Is there a way to run multiple datanodes in the same machine?


-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info