Posted to common-user@hadoop.apache.org by GO-HADOOP <om...@gmail.com> on 2008/11/01 18:53:27 UTC

How to read mapreduce output in HDFS directory from Web Application

I am new to Hadoop and am trying to understand the most efficient way to
read an output file from HDFS and display the result in a simple web
application.

Thanks

-- 
View this message in context: http://www.nabble.com/How-to-read-mapreduce-output-in-HDFS-directory-from-Web-Application-tp20282762p20282762.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

Re: How to read mapreduce output in HDFS directory from Web Application

Posted by GO-HADOOP <om...@gmail.com>.

I think I found what I was looking for regarding HBase. Thanks.

Re: How to read mapreduce output in HDFS directory from Web Application

Posted by GO-HADOOP <om...@gmail.com>.

Thank you, Jean & Alex, for your responses.

Alex, if you have an example or sample implementation that accesses HBase
data from a web application, could you please share it with me?

The problem I am trying to solve is that real-time data mining from a web
application becomes awfully slow in a commercial RDBMS once the data reaches
terabytes and each query has to work on a significantly large volume of it.
I am not sure whether Hadoop and HBase will help me in this area, so I am
researching the technology.

As I am new to this technology, I may be trying to do something it is not
intended for, so your views on this would be really helpful.

Regards

Re: How to read mapreduce output in HDFS directory from Web Application

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Alex,

It's an HBase design goal to be able to answer live non-relational
queries. True, up to 0.18, performance was not the priority, but 0.19 will be
MUCH faster. Also, more and more websites use HBase in a production
environment; see http://wiki.apache.org/hadoop/Hbase/PoweredBy

Regarding JDBC connectivity, it's not currently on HBase's roadmap since
the data model is not relational.

J-D

On Sun, Nov 2, 2008 at 12:06 PM, Alex Loddengaard <al...@cloudera.com> wrote:

> Someone else correct me if I'm wrong, but I don't think HBASE queries run
> nearly fast enough to be displayed on a website.  You would see long load
> times, and hence create a bad user experience.
> Agreed that you should definitely be concerned with MySQL tables becoming
> insanely large.  MySQL is really fast for simple, non-join queries, so
> perhaps you could output your data in a non-relational way, allowing MySQL
> to not spend time on joins.  You may also just simply have to upgrade your
> DB machine, or create a partitioning scheme.  With lots of data comes lots
> of hardware, unfortunately.
>
> Alex
>
> On Sun, Nov 2, 2008 at 6:30 AM, GO-HADOOP <om...@gmail.com> wrote:
>
> >
> > Thank you for your response. One more question on the same topic: is it
> > possible to use HBase tables? In other words, is there a way to connect
> > to an HBase table using a JDBC connection and fetch the result from it
> > to display as output in the web application?
> >
> > The MySQL option is OK; my concern is what happens if the map/reduce
> > output becomes so large that I run into similar performance issues to
> > those with other RDBMSs.
> >
> >
> > --
> > View this message in context:
> > http://www.nabble.com/How-to-read-mapreduce-output-in-HDFS-directory-from-Web-Application-tp20282762p20290241.html
> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> >
> >
>

Re: How to read mapreduce output in HDFS directory from Web Application

Posted by Alex Loddengaard <al...@cloudera.com>.

Someone else correct me if I'm wrong, but I don't think HBase queries run
nearly fast enough to be displayed on a website.  You would see long load
times and hence a bad user experience.

Agreed that you should definitely be concerned about MySQL tables becoming
insanely large.  MySQL is really fast for simple, non-join queries, so
perhaps you could output your data in a non-relational way, so that MySQL
does not spend time on joins.  You may also simply have to upgrade your
DB machine or create a partitioning scheme.  With lots of data comes lots
of hardware, unfortunately.
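
As a rough sketch of the "non-relational output" idea above, and not a definitive implementation: the join is paid for once, offline, and the web application then reads a flat table by key. Everything here is an assumption made for illustration: the `users`, `page_views`, and `report` tables, the sample rows, and the `lookup` helper are hypothetical, and Python's built-in sqlite3 stands in for MySQL only so that the example runs on its own.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE page_views (user_id INTEGER, views INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO page_views VALUES (1, 10), (1, 5), (2, 7);
""")

# Pay for the join and aggregation once, offline, and store the flat result.
# In the setup discussed in this thread, a batch job would produce these rows.
conn.execute("""
    CREATE TABLE report AS
    SELECT u.user_id AS user_id, u.name AS name, SUM(p.views) AS total_views
    FROM users u JOIN page_views p ON p.user_id = u.user_id
    GROUP BY u.user_id, u.name
""")
conn.execute("CREATE UNIQUE INDEX report_by_user ON report (user_id)")


def lookup(conn, user_id):
    """Single-table, single-key read: the kind of query the web app would issue."""
    return conn.execute(
        "SELECT name, total_views FROM report WHERE user_id = ?", (user_id,)
    ).fetchone()
```

Calling `lookup(conn, 1)` reads one indexed row from `report` and never touches `users` or `page_views`, which is the property that keeps the request path free of joins.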

Alex

On Sun, Nov 2, 2008 at 6:30 AM, GO-HADOOP <om...@gmail.com> wrote:

>
> Thank you for your response. One more question on the same topic: is it
> possible to use HBase tables? In other words, is there a way to connect to
> an HBase table using a JDBC connection and fetch the result from it to
> display as output in the web application?
>
> The MySQL option is OK; my concern is what happens if the map/reduce output
> becomes so large that I run into similar performance issues to those with
> other RDBMSs.
>
>
> --
> View this message in context:
> http://www.nabble.com/How-to-read-mapreduce-output-in-HDFS-directory-from-Web-Application-tp20282762p20290241.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>

Re: How to read mapreduce output in HDFS directory from Web Application

Posted by GO-HADOOP <om...@gmail.com>.

Thank you for your response. One more question on the same topic: is it
possible to use HBase tables? In other words, is there a way to connect to
an HBase table using a JDBC connection and fetch the result from it to
display as output in the web application?

The MySQL option is OK; my concern is what happens if the map/reduce output
becomes so large that I run into similar performance issues to those with
other RDBMSs.



Re: How to read mapreduce output in HDFS directory from Web Application

Posted by Alex Loddengaard <al...@cloudera.com>.

I suppose it depends on what you're trying to do.  One approach would be to
output SQL insert statements and import them into a database that a web app
could query.  Alternatively, you could output XML or JSON that can be
queried by an AJAX app.  Read more about MySQL connectivity here:
<https://issues.apache.org/jira/browse/HADOOP-2536>
<http://issues.apache.org/jira/browse/HADOOP-4331>
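
A minimal sketch of the two options above, under stated assumptions rather than as a definitive implementation: it assumes the job wrote plain text output with one tab-separated key/value pair per line, and that those lines have already been copied out of HDFS. The function names `parse_part_lines`, `to_json`, and `load_into_db`, the `word_counts` table, and the sample data are all hypothetical, and Python's built-in sqlite3 stands in for MySQL only so that the example runs on its own.

```python
import json
import sqlite3


def parse_part_lines(lines):
    """Split tab-separated key/value lines into (key, int(value)) pairs."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t", 1)
        records.append((key, int(value)))
    return records


def to_json(records):
    """Serialize the records as a JSON array that an AJAX client could fetch."""
    return json.dumps([{"key": k, "value": v} for k, v in records])


def load_into_db(conn, records):
    """Insert the records with a parameterized statement, not hand-built SQL text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_counts (word TEXT PRIMARY KEY, total INTEGER)"
    )
    conn.executemany("INSERT INTO word_counts (word, total) VALUES (?, ?)", records)
    conn.commit()


# Hypothetical sample standing in for the contents of one output file.
sample_lines = ["hadoop\t3\n", "hbase\t5\n", "\n"]
records = parse_part_lines(sample_lines)

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL connection
load_into_db(conn, records)
```

Using parameterized inserts instead of generated SQL text avoids quoting problems when keys contain apostrophes or other special characters; `to_json(records)` covers the JSON route for an AJAX front end.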

Alex

On Sat, Nov 1, 2008 at 10:53 AM, GO-HADOOP <om...@gmail.com> wrote:

>
> I am new to Hadoop and am trying to understand the most efficient way to
> read an output file from HDFS and display the result in a simple web
> application.
>
> Thanks
>
> --
> View this message in context:
> http://www.nabble.com/How-to-read-mapreduce-output-in-HDFS-directory-from-Web-Application-tp20282762p20282762.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>