Posted to common-user@hadoop.apache.org by Aditya Singh30 <Ad...@infosys.com> on 2011/10/05 12:13:10 UTC
Hadoop : Linux-Window interface
Hi,
We want to use Hadoop and Hive to store and analyze some web servers' log files. The servers run on Windows, and the Hadoop documentation says Windows is supported only as a development platform. I wanted to know whether we can run the Hadoop services (the NameNode and the cluster nodes) on Linux and have an interface through which we can send files and run analysis queries from the web servers' Windows environment.
I would really appreciate it if you could point me in the right direction.
Regards,
Aditya Singh
Infosys. India
**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely
for the use of the addressee(s). If you are not the intended recipient, please
notify the sender by e-mail and delete the original message. Further, you are not
to copy, disclose, or distribute this e-mail or its contents to any other person and
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken
every reasonable precaution to minimize this risk, but is not liable for any damage
you may sustain as a result of any virus in this e-mail. You should carry out your
own virus checks before opening the e-mail or attachment. Infosys reserves the
right to monitor and review the content of all messages sent to or from this e-mail
address. Messages sent to or from this e-mail address may be stored on the
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***
Re: Hadoop : Linux-Window interface
Posted by "Periya.Data" <pe...@gmail.com>.
Hi Aditya,
You may want to investigate using Flume... it is designed to
collect unstructured data from disparate sources and store it in HDFS (or
directly in Hive tables). I do not know whether Flume interoperates
with Windows systems (you may be able to make it work under Cygwin).
http://archive.cloudera.com/cdh/3/flume/Cookbook/
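For what it's worth, newer Flume releases are driven by a properties-file agent configuration; below is a minimal sketch of the pipeline described above. Every agent, host, and path name is a hypothetical placeholder, not something from this thread:

```
# Hypothetical Flume agent: watch a drop directory the Windows server
# ships logs into, and land the events in date-partitioned HDFS dirs.
agent.sources  = weblogs
agent.channels = mem
agent.sinks    = tohdfs

agent.sources.weblogs.type     = spooldir
agent.sources.weblogs.spoolDir = /var/flume/incoming
agent.sources.weblogs.channels = mem

agent.channels.mem.type     = memory
agent.channels.mem.capacity = 10000

agent.sinks.tohdfs.type                   = hdfs
agent.sinks.tohdfs.channel                = mem
agent.sinks.tohdfs.hdfs.path              = hdfs://namenode:8020/logs/%Y-%m-%d
agent.sinks.tohdfs.hdfs.fileType          = DataStream
agent.sinks.tohdfs.hdfs.useLocalTimeStamp = true
```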
-PD.
Re: Hadoop : Linux-Window interface
Posted by Bejoy KS <be...@gmail.com>.
Hi Aditya
Definitely you can do it. As a very basic solution, you can FTP the
contents to the LFS (local/Linux file system) and then do a copyFromLocal into
HDFS. Create a Hive table with appropriate regex support and load the data
in; Hive ships classes that support parsing and loading Apache
log files into Hive tables.
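A sketch of such a table for Apache combined-format access logs, using the contrib RegexSerDe. The table name, column names, jar path, and regex are illustrative assumptions to adapt, not the thread's own code:

```sql
-- Illustrative DDL; the hive-contrib jar must be on the classpath.
ADD JAR /usr/lib/hive/lib/hive-contrib.jar;  -- path varies by install

CREATE TABLE apache_log (
  host     STRING,
  identity STRING,
  user     STRING,
  time     STRING,
  request  STRING,
  status   STRING,
  size     STRING,
  referer  STRING,
  agent    STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?"
)
STORED AS TEXTFILE;

-- Then load the files that were copied into HDFS:
LOAD DATA INPATH '/user/hive/warehouse/incoming/access.log'
INTO TABLE apache_log;
```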
For the entire data transfer you just need to write a shell script. Log
analysis won't be real time, right? So you can schedule the job
with a scheduler like cron, or, to coordinate it with Hadoop
jobs, use a workflow manager from the Hadoop ecosystem, such as Oozie.
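The steps above can be sketched as a shell script. The hostnames, paths, and the `hdfs_dest` layout are all assumptions for illustration, not part of the original thread:

```shell
#!/bin/sh
# Sketch of the FTP -> HDFS transfer described above. All hostnames and
# paths are hypothetical placeholders; adjust for your environment.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Compute the HDFS destination for a local log file and a date stamp,
# matching a date-partitioned Hive table layout.
hdfs_dest() {
    echo "/user/hive/warehouse/access_log/dt=$2/$(basename "$1")"
}

LOCAL_FILE="/var/logs/inbox/access.log"   # where the FTP pull drops the file
DATESTAMP=$(date +%Y-%m-%d)

# 1. Pull the log from the Windows web server (scp/smbclient work as well).
run ftp -n webserver.example.com

# 2. Copy from the local Linux file system into HDFS.
run hadoop fs -copyFromLocal "$LOCAL_FILE" "$(hdfs_dest "$LOCAL_FILE" "$DATESTAMP")"

# 3. Schedule nightly via cron, e.g.:
#    0 2 * * * /opt/scripts/ship_logs.sh
```

With DRY_RUN left at 1 the script only prints the ftp and hadoop commands, so it can be reviewed safely before being wired into cron.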