Posted to user@hbase.apache.org by Pei Zhao <pe...@gmail.com> on 2015/10/02 21:00:57 UTC
Hadoop streaming with HBase as data sink
Hi all,
I am a graduate student doing research in solar cell data management. My project uses Hadoop/HBase. We recently switched our MapReduce jobs to Python using Hadoop streaming.
My question: can a Hadoop streaming job write to stdout in a specific format that HBase can pick up and put into tables?
For instance, if I output lines of RowKey\tColumn\tValue, would HBase know how to put them into tables?
Regards
Pei
HBase Meetup on 8-Oct/Thu in Bangalore
Posted by Priyank rastogi <pr...@huawei.com>.
Hi,
There is an HBase meetup planned on 8-Oct/Thursday in Bangalore.
So if you are interested and plan to attend, you can register at this URL.
https://www.eventbrite.com/e/advanced-hbase-meetup-tickets-18697793604
Regards
Priyank
Re: Hadoop streaming with HBase as data sink
Posted by Nick Dimiduk <nd...@gmail.com>.
Maybe run more thrift gateways? Maybe one on each host running map tasks
and have the tasks talk to localhost. That way your job doesn't bottleneck
through a single thrift server.
> Solar Cell Data Management sounds cool.
+1 :)
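A minimal sketch of the per-host gateway idea, not a tested recipe: each streaming task reads RowKey\tColumn\tValue lines from stdin and writes them through a Thrift gateway on its own host. It assumes the third-party happybase client is installed on every task node, that a Thrift server listens on localhost port 9090, and that the column is given as family:qualifier. The script name, table name, and batch size are hypothetical.

```python
import sys


def parse_line(line):
    """Split 'RowKey\\tfamily:qualifier\\tValue' into a row key and a cell dict."""
    row_key, column, value = line.rstrip("\n").split("\t", 2)
    return row_key.encode("utf-8"), {column.encode("utf-8"): value.encode("utf-8")}


def write_to_local_gateway(lines, table_name, batch_size=500):
    """Send puts through the Thrift gateway running on this task's own host."""
    import happybase  # third-party client; assumed to be installed on the node

    # Assumption: a Thrift server is listening on localhost:9090.
    connection = happybase.Connection("localhost", port=9090)
    try:
        table = connection.table(table_name)
        # The batch is flushed every batch_size puts and once more on exit,
        # so no single request carries the whole task's output.
        with table.batch(batch_size=batch_size) as batch:
            for line in lines:
                if line.strip():
                    row_key, cells = parse_line(line)
                    batch.put(row_key, cells)
    finally:
        connection.close()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python put_via_thrift.py <table_name> < records.tsv
    write_to_local_gateway(sys.stdin, sys.argv[1])
```

Keeping the batch size modest limits how much each request asks the gateway to buffer. Whether that helps with the heap problem depends on what is actually filling the heap.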
On Fri, Oct 2, 2015 at 1:19 PM, Stack <st...@duboce.net> wrote:
> On Fri, Oct 2, 2015 at 12:46 PM, Pei Zhao <pe...@gmail.com> wrote:
>
> > Hi Stack,
> >
> > In my case, I tried to use HBase Thrift API, but Thrift server sometimes
> > crashes during my MapReduce job, due to out of heap memory.
> >
> > Do you have any suggestions on that please?
> >
> >
> Give it more heap?
>
> What is your client?
>
> Solar Cell Data Management sounds cool.
>
> St.Ack
>
>
>
>
> > Thanks
> >
> > On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:
> >
> > > You can't do hadoop streaming into hbase. Maybe explore hbase REST
> > > interface and see if you can format puts that hbase REST can digest.
> > > St.Ack
> > >
> > > On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am a graduate student do research in solar cell data management. My
> > > > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> > > using
> > > > Hadoop streaming.
> > > >
> > > > My question is can I use Hadoop streaming, which outputs to stdout
> > with a
> > > > specific format that HBase can pick it up and put them into tables?
> > > >
> > > > For instance,
> > > >
> > > > If I output lines of RowKey\t\Column\tValue , then HBase can know how
> > to
> > > > put this into tables.
> > > >
> > > > Regards
> > > > Pei
> > >
> >
> >
> >
> > --
> > *Pei (Asher) Zhao*
> > *Electrical Engineering and Computer Science*
> > *Case Western Reserve University*
> > *Cleveland, Ohio 44106*
> >
> > *--The man who has made up his mind to win will never say impossible.*
> >
>
Re: Hadoop streaming with HBase as data sink
Posted by Stack <st...@duboce.net>.
On Fri, Oct 2, 2015 at 12:46 PM, Pei Zhao <pe...@gmail.com> wrote:
> Hi Stack,
>
> In my case, I tried to use HBase Thrift API, but Thrift server sometimes
> crashes during my MapReduce job, due to out of heap memory.
>
> Do you have any suggestions on that please?
>
>
Give it more heap?
What is your client?
Solar Cell Data Management sounds cool.
St.Ack
> Thanks
>
> On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:
>
> > You can't do hadoop streaming into hbase. Maybe explore hbase REST
> > interface and see if you can format puts that hbase REST can digest.
> > St.Ack
> >
> > On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I am a graduate student do research in solar cell data management. My
> > > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> > using
> > > Hadoop streaming.
> > >
> > > My question is can I use Hadoop streaming, which outputs to stdout
> with a
> > > specific format that HBase can pick it up and put them into tables?
> > >
> > > For instance,
> > >
> > > If I output lines of RowKey\t\Column\tValue , then HBase can know how
> to
> > > put this into tables.
> > >
> > > Regards
> > > Pei
> >
>
>
>
> --
> *Pei (Asher) Zhao*
> *Electrical Engineering and Computer Science*
> *Case Western Reserve University*
> *Cleveland, Ohio 44106*
>
> *--The man who has made up his mind to win will never say impossible.*
>
Re: Hadoop streaming with HBase as data sink
Posted by Pei Zhao <pe...@gmail.com>.
Hi Stack,
In my case, I tried to use the HBase Thrift API, but the Thrift server sometimes
crashes during my MapReduce job because it runs out of heap memory.
Do you have any suggestions for that?
Thanks
On Fri, Oct 2, 2015 at 3:10 PM, Stack <st...@duboce.net> wrote:
> You can't do hadoop streaming into hbase. Maybe explore hbase REST
> interface and see if you can format puts that hbase REST can digest.
> St.Ack
>
> On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am a graduate student do research in solar cell data management. My
> > project is using Hadoop/HBase. Recently we switch MapReduce to Python
> using
> > Hadoop streaming.
> >
> > My question is can I use Hadoop streaming, which outputs to stdout with a
> > specific format that HBase can pick it up and put them into tables?
> >
> > For instance,
> >
> > If I output lines of RowKey\t\Column\tValue , then HBase can know how to
> > put this into tables.
> >
> > Regards
> > Pei
>
--
*Pei (Asher) Zhao*
*Electrical Engineering and Computer Science*
*Case Western Reserve University*
*Cleveland, Ohio 44106*
*--The man who has made up his mind to win will never say impossible.*
Re: Hadoop streaming with HBase as data sink
Posted by Stack <st...@duboce.net>.
You can't do Hadoop streaming into HBase. Maybe explore the HBase REST
interface and see if you can format puts that it can digest.
St.Ack
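A rough illustration of the REST route, offered as a sketch under assumptions rather than a verified recipe: the HBase REST gateway accepts row data as JSON in which the row key, column name, and value are each base64-encoded. The code below turns one RowKey\tColumn\tValue line into such a body. The host, port 8080, table name, and sample record are hypothetical, and the column is assumed to be written as family:qualifier.

```python
import base64
import json
import urllib.parse
import urllib.request


def b64(text):
    """Base64-encode a string, as the REST JSON representation expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def line_to_rest_put(line):
    """Turn 'RowKey\\tfamily:qualifier\\tValue' into (row_key, JSON body)."""
    row_key, column, value = line.rstrip("\n").split("\t", 2)
    body = {
        "Row": [
            {
                "key": b64(row_key),
                "Cell": [{"column": b64(column), "$": b64(value)}],
            }
        ]
    }
    return row_key, json.dumps(body)


def send_put(base_url, table, row_key, payload):
    """PUT one row to the gateway; base_url and table are assumptions."""
    url = "%s/%s/%s" % (
        base_url.rstrip("/"),
        urllib.parse.quote(table, safe=""),
        urllib.parse.quote(row_key, safe=""),
    )
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Only builds and prints the body; nothing is sent over the network here.
    key, payload = line_to_rest_put("cell-001\tm:voltage\t0.62\n")
    print(key)
    print(payload)
```

A streaming reducer could call send_put("http://localhost:8080", "solar_cells", key, payload) for each output line, where both arguments are placeholders for your own gateway and table. Sending one row per request is simple but chatty, so grouping several rows into one body may be worth trying.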
On Fri, Oct 2, 2015 at 12:00 PM, Pei Zhao <pe...@gmail.com> wrote:
> Hi all,
>
> I am a graduate student do research in solar cell data management. My
> project is using Hadoop/HBase. Recently we switch MapReduce to Python using
> Hadoop streaming.
>
> My question is can I use Hadoop streaming, which outputs to stdout with a
> specific format that HBase can pick it up and put them into tables?
>
> For instance,
>
> If I output lines of RowKey\t\Column\tValue , then HBase can know how to
> put this into tables.
>
> Regards
> Pei