Posted to user@hbase.apache.org by Pei Zhao <pe...@gmail.com> on 2015/10/02 21:00:57 UTC

Hadoop streaming with HBase as data sink

Hi all,

I am a graduate student doing research in solar cell data management. My project uses Hadoop/HBase. Recently we switched our MapReduce jobs to Python using Hadoop streaming.

My question is: can a Hadoop streaming job write to stdout in a specific format that HBase can pick up and load into tables?

For instance, if I output lines of RowKey\tColumn\tValue, would HBase know how to put them into tables?

Regards
Pei

HBase Meetup on 8-Oct/Thu in Bangalore

Posted by Priyank rastogi <pr...@huawei.com>.
Hi,

There is an HBase meetup planned on 8-Oct/Thursday in Bangalore.
So if you are interested and plan to attend, you can register at this URL.
  https://www.eventbrite.com/e/advanced-hbase-meetup-tickets-18697793604

Regards
Priyank

Re: Hadoop streaming with HBase as data sink

Posted by Nick Dimiduk <nd...@gmail.com>.
Maybe run more Thrift gateways? Perhaps one on each host running map tasks,
and have the tasks talk to localhost. That way your job doesn't bottleneck
on a single Thrift server.

> Solar Cell Data Management sounds cool.

+1 :)
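
For anyone trying the one-gateway-per-host idea above, here is a minimal
sketch of a streaming reducer that writes to a Thrift gateway on localhost.
It is only an illustration under assumptions that do not come from this
thread: the third-party HappyBase client is installed on the task nodes, the
gateway listens on the default Thrift port 9090, the table name "solar_cells"
is hypothetical, and each input line is RowKey\tfamily:qualifier\tValue.
Sending puts in batches may also ease memory pressure on the gateway, though
that has not been verified here.

```python
import sys


def parse_line(line):
    """Split 'RowKey<TAB>family:qualifier<TAB>Value' into three bytes fields.

    Returns None for lines that do not have exactly three tab-separated parts.
    """
    parts = line.rstrip("\n").split("\t", 2)
    if len(parts) != 3:
        return None
    row, column, value = parts
    return row.encode("utf-8"), column.encode("utf-8"), value.encode("utf-8")


def write_to_hbase(lines, table, batch_size=1000):
    """Send each well-formed line as a put, flushing every batch_size puts.

    `table` is expected to behave like a happybase.Table. Returns the number
    of puts queued.
    """
    written = 0
    with table.batch(batch_size=batch_size) as batch:
        for line in lines:
            parsed = parse_line(line)
            if parsed is None:
                continue  # skip malformed lines instead of failing the task
            row, column, value = parsed
            batch.put(row, {column: value})
            written += 1
    return written


def main():
    # Assumption: HappyBase is installed and a Thrift gateway runs on this host.
    import happybase

    connection = happybase.Connection("localhost", port=9090)
    try:
        # "solar_cells" is a hypothetical table name used for illustration.
        write_to_hbase(sys.stdin, connection.table("solar_cells"))
    finally:
        connection.close()
```

To use it as the reducer script, call main() at the end of the file and pass
the script to Hadoop streaming as the reducer.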

On Fri, Oct 2, 2015 at 1:19 PM, Stack <st...@duboce.net> wrote:

> On Fri, Oct 2, 2015 at 12:46 PM, Pei Zhao <pe...@gmail.com> wrote:
>
> > Hi Stack,
> >
> > In my case, I tried to use HBase Thrift API, but Thrift server sometimes
> > crashes during my MapReduce job, due to out of heap memory.
> >
> > Do you have any suggestions on that please?
> >
> >
> Give it more heap?
>
> What is your client?
>
> Solar Cell Data Management sounds cool.
>
> St.Ack
>
>
>
>
> > Thanks
> >
> > On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:
> >
> > > You can't do hadoop streaming into hbase. Maybe explore hbase REST
> > > interface and see if you can format puts that hbase REST can digest.
> > > St.Ack
> > >
> > > On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am a graduate student do research in solar cell data management. My
> > > > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> > > > using Hadoop streaming.
> > > >
> > > > My question is can I use Hadoop streaming, which outputs to stdout
> > > > with a specific format that HBase can pick it up and put them into
> > > > tables?
> > > >
> > > > For instance,
> > > >
> > > > If I output lines of RowKey\t\Column\tValue , then HBase can know how
> > > > to put this into tables.
> > > >
> > > > Regards
> > > > Pei
> > >
> >
> >
> >
> > --
> > *Pei (Asher) Zhao*
> > *Electrical Engineering and Computer Science*
> > *Case Western Reserve University*
> > *Cleveland, Ohio 44106*
> >
> > *--The man who has made up his mind to win will never say impossible.*
> >
>

Re: Hadoop streaming with HBase as data sink

Posted by Stack <st...@duboce.net>.
On Fri, Oct 2, 2015 at 12:46 PM, Pei Zhao <pe...@gmail.com> wrote:

> Hi Stack,
>
> In my case, I tried to use HBase Thrift API, but Thrift server sometimes
> crashes during my MapReduce job, due to out of heap memory.
>
> Do you have any suggestions on that please?
>
>
Give it more heap?

What is your client?

Solar Cell Data Management sounds cool.

St.Ack




> Thanks
>
> On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:
>
> > You can't do hadoop streaming into hbase. Maybe explore hbase REST
> > interface and see if you can format puts that hbase REST can digest.
> > St.Ack
> >
> > On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I am a graduate student do research in solar cell data management. My
> > > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> > > using Hadoop streaming.
> > >
> > > My question is can I use Hadoop streaming, which outputs to stdout
> > > with a specific format that HBase can pick it up and put them into
> > > tables?
> > >
> > > For instance,
> > >
> > > If I output lines of RowKey\t\Column\tValue , then HBase can know how
> > > to put this into tables.
> > >
> > > Regards
> > > Pei
> >
>
>
>
> --
> *Pei (Asher) Zhao*
> *Electrical Engineering and Computer Science*
> *Case Western Reserve University*
> *Cleveland, Ohio 44106*
>
> *--The man who has made up his mind to win will never say impossible.*
>

Re: Hadoop streaming with HBase as data sink

Posted by Pei Zhao <pe...@gmail.com>.
Hi Stack,

In my case, I tried using the HBase Thrift API, but the Thrift server
sometimes crashes during my MapReduce job because it runs out of heap memory.

Do you have any suggestions for that?

Thanks

On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:

> You can't do hadoop streaming into hbase. Maybe explore hbase REST
> interface and see if you can format puts that hbase REST can digest.
> St.Ack
>
> On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am a graduate student do research in solar cell data management. My
> > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> > using Hadoop streaming.
> >
> > My question is can I use Hadoop streaming, which outputs to stdout with a
> > specific format that HBase can pick it up and put them into tables?
> >
> > For instance,
> >
> > If I output lines of RowKey\t\Column\tValue , then HBase can know how to
> > put this into tables.
> >
> > Regards
> > Pei
>



-- 
*Pei (Asher) Zhao*
*Electrical Engineering and Computer Science*
*Case Western Reserve University*
*Cleveland, Ohio 44106*

*--The man who has made up his mind to win will never say impossible.*

Re: Hadoop streaming with HBase as data sink

Posted by Stack <st...@duboce.net>.
You can't do Hadoop streaming into HBase. Maybe explore the HBase REST
interface and see if you can format puts that HBase REST can digest.
St.Ack
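
As a rough illustration of what "puts that HBase REST can digest" might look
like from Python, the sketch below turns RowKey\tColumn\tValue lines into the
JSON cell-set body that the REST gateway accepts, with row keys, columns, and
values base64-encoded. Several things here are assumptions and not from this
thread: the function names are hypothetical, the column is taken to be in
family:qualifier form, and the gateway address passed as base_url (for
example http://localhost:8080) depends on your deployment.

```python
import base64
import json
import urllib.request


def b64(text):
    """Base64-encode a string, as the REST JSON representation expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_cellset(lines):
    """Convert 'RowKey<TAB>family:qualifier<TAB>Value' lines to a cell-set dict.

    Lines without three tab-separated fields are skipped.
    """
    rows = []
    for line in lines:
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) != 3:
            continue
        key, column, value = parts
        rows.append({
            "key": b64(key),
            "Cell": [{"column": b64(column), "$": b64(value)}],
        })
    return {"Row": rows}


def put_cellset(base_url, table, cellset):
    """PUT a cell set to the REST gateway and return the HTTP status code.

    The row component of the URL is a placeholder; the row keys that are
    actually written come from the JSON body.
    """
    request = urllib.request.Request(
        url="%s/%s/fakerow" % (base_url.rstrip("/"), table),
        data=json.dumps(cellset).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A streaming reducer could collect a few hundred lines from stdin, call
build_cellset on them, and send each chunk with put_cellset, so that each
request stays small.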

On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:

> Hi all,
>
> I am a graduate student do research in solar cell data management. My
> project is using Hadoop/HBase. Recently we switch MapReduce to Python using
> Hadoop streaming.
>
> My question is can I use Hadoop streaming, which outputs to stdout with a
> specific format that HBase can pick it up and put them into tables?
>
> For instance,
>
> If I output lines of RowKey\t\Column\tValue , then HBase can know how to
> put this into tables.
>
> Regards
> Pei