Posted to common-user@hadoop.apache.org by Sher Khan <wh...@gmail.com> on 2008/02/23 11:37:35 UTC

newbie question... please help.

hi brothers,
i am completely confused about the hadoop usage / deployment etc.

yes, i did read the documentation & other details on the apache foundation
site, yet i am a dumb a** & perhaps hence am still confused all the more.

pray allow me to explain my situation in detail.

i have different clusters of servers to perform www + application + database
+ storage functions @ my work.

the entire architecture is running on win2k3 enterprise server + mssql 2k5 +
asp.net 2.0 for the frontend.

i was wondering if i could use / deploy hadoop in my scenario & if i can,
how?

if i am not mistaken, hadoop is a distributed file system [similar to ntfs,
hpfs, fat32, etc. except that it is far more advanced & is capable of
storing the data residing on it in a "map/reduce" kind of architecture] & if
i need to use hadoop @ my location, i shall need to engineer my applications
from the ground up on php / java, mysql [or some other open source compliant
db] & linux as my o/s & storage system?

can some kind soul please help me understand the concept &/or guide me to
better-drafted reference resources please?

thanks in advance for the help bro's.

Re: newbie question... please help.

Posted by Sher Khan <wh...@gmail.com>.
thanks a truckload for the response. guess there is no shortcut to it. i
will start learning linux a.s.a.p. & try to get this thing right. will
post more dumb questions here! thanks again!

On Sun, Feb 24, 2008 at 2:07 AM, Jeff Eastman <je...@collab.net> wrote:

> If your main question is "can I host my mssql database on the Hadoop
> DFS?", then the answer is no. The DFS is designed for large files that
> are write once, read multiple and a database engine would want to update
> the files.
>
> If, OTOH, your question is "can I move (some of) my mssql database into
> Hadoop so I can run some map/reduce jobs against it?", then the answer
> is yes, keep on reading.
>
> Deploying Hadoop on Win2k is reputedly possible using Cygwin, but more
> complicated than just using Linux directly. You are going to have to
> learn your way around Linux one way or the other, so you might as well
> take the easier path. It will be an adventure well worth your time.
>
> Jeff

RE: newbie question... please help.

Posted by Jeff Eastman <je...@collab.net>.
If your main question is "can I host my mssql database on the Hadoop
DFS?", then the answer is no. The DFS is designed for large files that
are written once and read many times, whereas a database engine would want
to update its files in place.
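
A minimal sketch of the "write once, read many" idea, for illustration only.
This is not the Hadoop API: the WriteOnceStore class and the
/exports/orders.csv path are hypothetical names invented here. It only shows
why an engine that wants to rewrite its files in place does not fit this
storage model.

```python
class WriteOnceStore:
    """Toy stand-in for a write-once, read-many file store (hypothetical)."""

    def __init__(self):
        self._files = {}

    def create(self, path, data):
        # A file can be written exactly once; there is no in-place update.
        if path in self._files:
            raise PermissionError(f"{path} already exists and cannot be modified")
        self._files[path] = bytes(data)

    def read(self, path):
        # Reads can be repeated as often as needed.
        return self._files[path]


store = WriteOnceStore()
store.create("/exports/orders.csv", b"1,alice,30\n2,bob,45\n")

# Reading the same file many times is fine.
first_read = store.read("/exports/orders.csv")
second_read = store.read("/exports/orders.csv")

# A database engine would want to rewrite part of the file; this store refuses.
try:
    store.create("/exports/orders.csv", b"1,alice,99\n")
    update_allowed = True
except PermissionError:
    update_allowed = False
```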

If, OTOH, your question is "can I move (some of) my mssql database into
Hadoop so I can run some map/reduce jobs against it?", then the answer
is yes, keep on reading.
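
To make "run some map/reduce jobs against it" a little more concrete, here
is a rough single-process sketch of the map/reduce pattern in plain Python.
It does not use Hadoop at all; the sample rows, the column layout
(id,customer,amount) and the function names are assumptions made up for
illustration.

```python
from collections import defaultdict

# Hypothetical rows exported from an "orders" table as text: id,customer,amount
exported_rows = [
    "1,alice,30",
    "2,bob,45",
    "3,alice,25",
]


def map_row(line):
    # Map step: emit one (key, value) pair per input record.
    _order_id, customer, amount = line.split(",")
    yield customer, int(amount)


def reduce_customer(customer, amounts):
    # Reduce step: combine all values that share a key.
    return customer, sum(amounts)


def run_job(lines):
    # Group mapped values by key, which a framework would do between phases.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_row(line):
            grouped[key].append(value)
    return dict(reduce_customer(k, v) for k, v in grouped.items())


totals = run_job(exported_rows)
```

In a real deployment the rows would first be exported from the database to
files on the DFS, and the framework would spread the map and reduce steps
across the cluster; here everything runs in one process.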

Deploying Hadoop on Win2k is reputedly possible using Cygwin, but more
complicated than just using Linux directly. You are going to have to
learn your way around Linux one way or the other, so you might as well
take the easier path. It will be an adventure well worth your time.

Jeff

-----Original Message-----
From: Sher Khan [mailto:whymail@gmail.com] 
Sent: Saturday, February 23, 2008 9:29 AM
To: core-user@hadoop.apache.org
Subject: Re: newbie question... please help.

thanks for the quick note.
problem is, i am a complete rookie @ linux + related technologies so will
have to read n reread stuff a lot - which i am ready for - so will study the
wiki more carefully.

can u just respond in a lil detail that if hadoop is a dfs & it runs on or
has the ability of map/reduce paradigms, can i host any of my databases on
to it? or is it that since it does a map/reduce, setting a db on hadoof dfs
shall prove to be entirely useless?

i know this might sound completely stoopid for all u experts however its
just my bad that i started on this late :(

please help!


Re: newbie question... please help.

Posted by Sher Khan <wh...@gmail.com>.
thanks for the quick note.
problem is, i am a complete rookie @ linux + related technologies so will
have to read n reread stuff a lot - which i am ready for - so will study the
wiki more carefully.

can u just respond in a lil detail: if hadoop is a dfs & it runs or
supports the map/reduce paradigm, can i host any of my databases on it? or
is it that since it does map/reduce, setting up a db on the hadoop dfs shall
prove to be entirely useless?

i know this might sound completely stoopid to all u experts, however it's
just my bad that i started on this late :(

please help!

On Sat, Feb 23, 2008 at 8:09 PM, 11 Nov. <no...@gmail.com> wrote:

> Hadoop is DFS+MapReduce, which two concepts are totally independent.
> I think you can start with hadoop wiki and the tutorial there.

Re: newbie question... please help.

Posted by "11 Nov." <no...@gmail.com>.
Hadoop is DFS + MapReduce; the two concepts are totally independent.
I think you can start with hadoop wiki and the tutorial there.

2008/2/23, Sher Khan <wh...@gmail.com>:
>
> hi brothers,
> i am completely confused about the hadoop usage / deployment etc.
>
> yes, i did read the documentation & other details on apach foundation site
> yet i am a dumb a** n perhaps hence am still confused all the more.
>
> pray allow me to explain my situation in detail.
>
> i have different clusters of servers to perform www + application + database
> + storage functions @ my work.
>
> the entire architecture is running on win2k3 enterprise server + mssql 2k5 +
> asp.net 2.0 for the frontend.
>
> i was wondering if i cud be in a position to use / deploy hadoop @ my
> scenario & if i can how?
>
> if i am not mistaken, hadoop is a distributed file system [similar to ntfs,
> hpfs, fat32, etc. except for its far too advanced & is capable of storing
> the data residing on it in "map/reduce" kind of architecture] & if i need to
> use hadoop @ my location, i shall need to engineer my applications ground up
> on php / java, mysql [or some other open source compliant db] & linux as my
> o/s & storage system?
>
> can some kind soul please help me understand the concept &/or guide me to a
> better drafter reference resources please?
>
> thanks in advance for the help bro's.
>