Posted to user@hbase.apache.org by "Edward J. Yoon" <ed...@apache.org> on 2008/11/20 07:40:34 UTC

Bulk import question.

When I tried a bulk import, I received the error below. (The code is
the same as on the HBase wiki.)

08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%

Is that possible? Also, the hadoop/hbase daemons crashed.

- hadoop-0.18.2 & hbase-0.18.1
- 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
- 2 node cluster


----
08/11/20 15:23:36 INFO mapred.JobClient:  map 57% reduce 0%
08/11/20 15:23:40 INFO mapred.JobClient:  map 59% reduce 0%
08/11/20 15:23:45 INFO mapred.JobClient:  map 60% reduce 0%
08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
attempt_200811131622_0019_m_000000_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server 61.247.201.164:60020 for region
mail,,1227162121175, row '?:', but failed after 10 attempts.
Exceptions:
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception
java.io.IOException: Call failed on local exception

        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
        at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
        at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
        at com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
        at com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Re[2]: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
Sure. Only the map code was changed, and the input data is a text file
from a sqlite dump.

Input line : "999949|4217|0||1211920741.790266.26733.816999|0|Subject:cp949:Y
From:cp949:Y To:cp949:Y Cc::N Bcc::N ReplyTo::Y|..."

Map:

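    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
      if (table == null)
        throw new IOException("table is null");

      // One input record per line, fields separated by '|'.
      String[] splits = value.toString().split("[|]");

      // Skip malformed records before parsing the row key, so a short
      // or empty line cannot fail the task in Integer.parseInt.
      if (splits.length != 31)
        return;

      // The row key is the first field, stored as a 4-byte integer.
      BatchUpdate update = new BatchUpdate(Bytes.toBytes(Integer
          .parseInt(splits[0])));

      // Write every non-empty field to its column.
      for (int i = 0; i < splits.length; i++) {
        if (!splits[i].equals("")) {
          update.put(TEXT + fields[i], Bytes.toBytes(splits[i]));
        }
      }
      table.commit(update);
    }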
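
One thing worth checking in a map like the one above: Java's
String.split(regex) drops trailing empty strings, so a record whose last
fields are empty yields fewer than 31 elements and would be silently
skipped by the length check. Below is a minimal sketch of the difference.
It assumes nothing about the real data: the class name MailLineParser,
the helper splitLine, and the sample strings are hypothetical and only
for illustration.

```java
public class MailLineParser {

    /**
     * Splits a pipe-delimited record and keeps trailing empty fields.
     * Returns null when the record does not have the expected field count.
     */
    static String[] splitLine(String line, int expectedFields) {
        // A negative limit tells split() to keep trailing empty strings.
        String[] parts = line.split("[|]", -1);
        return parts.length == expectedFields ? parts : null;
    }

    public static void main(String[] args) {
        String record = "a|b||"; // four fields, the last two empty

        // Default split: trailing empty fields are discarded.
        System.out.println(record.split("[|]").length);     // prints 2

        // Split with limit -1: all four fields are kept.
        System.out.println(record.split("[|]", -1).length); // prints 4

        // The helper accepts the record only when the count matches.
        System.out.println(splitLine(record, 4) != null);   // prints true
        System.out.println(splitLine("a|b", 4) != null);    // prints false
    }
}
```

If the sqlite dump can end a line with empty columns, passing a negative
limit may be the safer choice; if it cannot, the original call behaves
the same.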


On Thu, Nov 20, 2008 at 3:53 PM, ROL <dm...@rol.ru> wrote:
> Hi
>
> Can you show source?
>
>> While job running, I scanned that table at the same time.
>
> --
> Best regards,
>  ROL                          mailto:dmkd@rol.ru
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re[2]: Bulk import question.

Posted by ROL <dm...@rol.ru>.
Hi

Can you show source?

> While job running, I scanned that table at the same time.

-- 
Best regards,
 ROL                          mailto:dmkd@rol.ru


Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
While the job was running, I was also scanning that table.

On Thu, Nov 20, 2008 at 3:40 PM, Edward J. Yoon <ed...@apache.org> wrote:
> When I tried to bulk import, I received below error. (code is same
> with hbase wiki)
>
> 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
>
> Is it possible? And, hadoop/hbase daemons are crashed.
>
> - hadoop-0.18.2 & hbase-0.18.1
> - 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
> - 2 node cluster
>
>
> ----
> 08/11/20 15:23:36 INFO mapred.JobClient:  map 57% reduce 0%
> 08/11/20 15:23:40 INFO mapred.JobClient:  map 59% reduce 0%
> 08/11/20 15:23:45 INFO mapred.JobClient:  map 60% reduce 0%
> 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
> attempt_200811131622_0019_m_000000_0, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 61.247.201.164:60020 for region
> mail,,1227162121175, row ' ?:', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
>
>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
>        at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
>        at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
>        at com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
>        at com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> edwardyoon@apache.org
> http://blog.udanax.org
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
One more question: I wonder whether more map tasks always means a faster job, under any conditions.

/Edward

On Tue, Dec 2, 2008 at 2:32 PM, Andrew Purtell <ap...@yahoo.com> wrote:
> Hello Jonathan,
>
> +1 on sharing your Nagios plugins for Hadoop and HBase as a
> contrib. :-)
>
>   - Andy
>
>
>> From: Jonathan Gray <jl...@streamy.com>
>> Subject: RE: Bulk import question.
>> To: hbase-user@hadoop.apache.org
>> Date: Monday, December 1, 2008, 7:20 PM
>>
>> Your new best friends:  Ganglia and Nagios
>>
>> Ganglia is great for monitoring cluster-wide resource usage
>> over time.  You'll see memory, cpu, disk, network usage
>> over time for entire cluster and for each node.  It is very
>> easy to setup because it uses UDP broadcast so no need to
>> actually configure nodes in conf files.  HBase 0.19
>> introduces ganglia metrics which will also be available in
>> the ganglia web interface.
>>
>> http://ganglia.info/
>>
>> Nagios is good for monitoring services as well as resource
>> utilization.  Rather than give data over time, it's aim
>> is really to alert you when something is wrong.  For
>> example, when a server is no longer reachable or when
>> available disk space reaches a configurable threshold.  It
>> does require a bit more work to get up and running because
>> you have to setup your node and service configurations.  I
>> have written custom nagios plugins for hadoop and hbase, if
>> there's interest I will look at cleaning them up and
>> contrib'ing them.
>>
>> http://www.nagios.org/
>>
>> Both are free and essential tools for properly monitoring
>> your cluster.
>>
>> JG
>
>
>
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
Oh. ganglia seems great. :) Thanks.

On Tue, Dec 2, 2008 at 12:20 PM, Jonathan Gray <jl...@streamy.com> wrote:
> Your new best friends:  Ganglia and Nagios
>
> Ganglia is great for monitoring cluster-wide resource usage over time.  You'll see memory, cpu, disk, network usage over time for entire cluster and for each node.  It is very easy to setup because it uses UDP broadcast so no need to actually configure nodes in conf files.  HBase 0.19 introduces ganglia metrics which will also be available in the ganglia web interface.
>
> http://ganglia.info/
>
> Nagios is good for monitoring services as well as resource utilization.  Rather than give data over time, it's aim is really to alert you when something is wrong.  For example, when a server is no longer reachable or when available disk space reaches a configurable threshold.  It does require a bit more work to get up and running because you have to setup your node and service configurations.  I have written custom nagios plugins for hadoop and hbase, if there's interest I will look at cleaning them up and contrib'ing them.
>
> http://www.nagios.org/
>
> Both are free and essential tools for properly monitoring your cluster.
>
> JG
>
>> -----Original Message-----
>> From: edward@udanax.org [mailto:edward@udanax.org] On Behalf Of Edward
>> J. Yoon
>> Sent: Monday, December 01, 2008 7:04 PM
>> To: apurtell@apache.org
>> Cc: hbase-user@hadoop.apache.org; 02635@nhncorp.com
>> Subject: Re: Bulk import question.
>>
>> I'm considering to store the large-scale web-mail data on the Hbase.
>> As you know, there is a lot of mail bomb (e.g. spam, group mail,...,
>> etc). So, I tested these.
>>
>> Here's my additionally question. Have we a monitoring tool for disk
>> space?
>>
>> /Edward
>>
>> On Tue, Dec 2, 2008 at 11:42 AM, Andrew Purtell <ap...@apache.org>
>> wrote:
>> > Edward,
>> >
>> > You are running with insufficient resources -- too little CPU
>> > for your task and too little disk for your data.
>> >
>> > If you are running a mapreduce task and DFS runs out of space
>> > for the temporary files, then you indeed should expect
>> > aberrant job status from the Hadoop job framework, for
>> > example such things as completion status running backwards.
>> >
>> > I do agree that under these circumstances HBase daemons
>> > should fail more gracefully, by entering some kind of
>> > degraded read only mode, if DFS is not totally dead. I
>> > suspect this is already on a to do list somewhere, and I
>> > vaguely recall a jira filed on that topic.
>> >
>> >   - Andy
>> >
>> >
>> >> From: Edward J. Yoon <ed...@apache.org>
>> >> Subject: Re: Bulk import question.
>> >> To: hbase-user@hadoop.apache.org, apurtell@apache.org
>> >> Date: Monday, December 1, 2008, 6:26 PM
>> >> It was by 'Datanode DiskOutOfSpaceException'. But, I
>> >> think daemons should not dead.
>> >>
>> >> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
>> >> <ed...@apache.org> wrote:
>> >> > Hmm. It often occurs to me. I'll check the logs.
>> >> >
>> >> > On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
>> >> <ap...@yahoo.com> wrote:
>> >> > > I think a 2 node cluster is simply too small for
>> >> > > the full load of everything.
>> >> > >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> edwardyoon@apache.org
>> http://blog.udanax.org
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

RE: Bulk import question.

Posted by Andrew Purtell <ap...@yahoo.com>.
Hello Jonathan,

+1 on sharing your Nagios plugins for Hadoop and HBase as a
contrib. :-)

   - Andy


> From: Jonathan Gray <jl...@streamy.com>
> Subject: RE: Bulk import question.
> To: hbase-user@hadoop.apache.org
> Date: Monday, December 1, 2008, 7:20 PM
>
> Your new best friends:  Ganglia and Nagios
> 
> Ganglia is great for monitoring cluster-wide resource usage
> over time.  You'll see memory, cpu, disk, network usage
> over time for entire cluster and for each node.  It is very
> easy to setup because it uses UDP broadcast so no need to
> actually configure nodes in conf files.  HBase 0.19
> introduces ganglia metrics which will also be available in
> the ganglia web interface.
> 
> http://ganglia.info/
> 
> Nagios is good for monitoring services as well as resource
> utilization.  Rather than give data over time, it's aim
> is really to alert you when something is wrong.  For
> example, when a server is no longer reachable or when
> available disk space reaches a configurable threshold.  It
> does require a bit more work to get up and running because
> you have to setup your node and service configurations.  I
> have written custom nagios plugins for hadoop and hbase, if
> there's interest I will look at cleaning them up and
> contrib'ing them.
> 
> http://www.nagios.org/
> 
> Both are free and essential tools for properly monitoring
> your cluster.
> 
> JG



      

RE: Bulk import question.

Posted by Jonathan Gray <jl...@streamy.com>.
Your new best friends:  Ganglia and Nagios

Ganglia is great for monitoring cluster-wide resource usage over time.  You'll see memory, CPU, disk, and network usage over time for the entire cluster and for each node.  It is very easy to set up because it uses UDP broadcast, so there is no need to actually configure nodes in conf files.  HBase 0.19 introduces Ganglia metrics, which will also be available in the Ganglia web interface.

http://ganglia.info/

Nagios is good for monitoring services as well as resource utilization.  Rather than give data over time, its aim is really to alert you when something is wrong: for example, when a server is no longer reachable or when available disk space reaches a configurable threshold.  It does require a bit more work to get up and running because you have to set up your node and service configurations.  I have written custom Nagios plugins for Hadoop and HBase; if there's interest I will look at cleaning them up and contributing them.

http://www.nagios.org/

Both are free and essential tools for properly monitoring your cluster.
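
As a rough illustration of the threshold-style disk check described above,
here is a minimal sketch in Java. It is not one of the plugins mentioned
in this message. The class name DiskSpaceCheck, the classify helper, and
the 20%/10% free-space thresholds are assumptions made up for the
example. The exit codes follow the usual Nagios plugin convention
(0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```java
import java.io.File;

public class DiskSpaceCheck {
    // Conventional Nagios plugin exit codes.
    static final int OK = 0, WARNING = 1, CRITICAL = 2, UNKNOWN = 3;

    /**
     * Maps free space to a status code. warnFree and critFree are the
     * minimum free fractions (0..1) below which we warn or go critical.
     */
    static int classify(long usableBytes, long totalBytes,
                        double warnFree, double critFree) {
        if (totalBytes <= 0) {
            return UNKNOWN; // path missing or size not available
        }
        double freeFraction = (double) usableBytes / totalBytes;
        if (freeFraction < critFree) {
            return CRITICAL;
        }
        if (freeFraction < warnFree) {
            return WARNING;
        }
        return OK;
    }

    public static void main(String[] args) {
        // Directory to check; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        long total = dir.getTotalSpace();
        long usable = dir.getUsableSpace();

        // Hypothetical thresholds: warn under 20% free, critical under 10%.
        int status = classify(usable, total, 0.20, 0.10);
        String[] labels = {"OK", "WARNING", "CRITICAL", "UNKNOWN"};
        System.out.printf("DISK %s - %d of %d bytes usable on %s%n",
            labels[status], usable, total, dir.getAbsolutePath());
        System.exit(status);
    }
}
```

Pointed at a datanode's data directory, a check like this would have
flagged the shrinking free space before the import failed.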

JG

> -----Original Message-----
> From: edward@udanax.org [mailto:edward@udanax.org] On Behalf Of Edward
> J. Yoon
> Sent: Monday, December 01, 2008 7:04 PM
> To: apurtell@apache.org
> Cc: hbase-user@hadoop.apache.org; 02635@nhncorp.com
> Subject: Re: Bulk import question.
> 
> I'm considering to store the large-scale web-mail data on the Hbase.
> As you know, there is a lot of mail bomb (e.g. spam, group mail,...,
> etc). So, I tested these.
> 
> Here's my additionally question. Have we a monitoring tool for disk
> space?
> 
> /Edward
> 
> On Tue, Dec 2, 2008 at 11:42 AM, Andrew Purtell <ap...@apache.org>
> wrote:
> > Edward,
> >
> > You are running with insufficient resources -- too little CPU
> > for your task and too little disk for your data.
> >
> > If you are running a mapreduce task and DFS runs out of space
> > for the temporary files, then you indeed should expect
> > aberrant job status from the Hadoop job framework, for
> > example such things as completion status running backwards.
> >
> > I do agree that under these circumstances HBase daemons
> > should fail more gracefully, by entering some kind of
> > degraded read only mode, if DFS is not totally dead. I
> > suspect this is already on a to do list somewhere, and I
> > vaguely recall a jira filed on that topic.
> >
> >   - Andy
> >
> >
> >> From: Edward J. Yoon <ed...@apache.org>
> >> Subject: Re: Bulk import question.
> >> To: hbase-user@hadoop.apache.org, apurtell@apache.org
> >> Date: Monday, December 1, 2008, 6:26 PM
> >> It was by 'Datanode DiskOutOfSpaceException'. But, I
> >> think daemons should not dead.
> >>
> >> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
> >> <ed...@apache.org> wrote:
> >> > Hmm. It often occurs to me. I'll check the logs.
> >> >
> >> > On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
> >> <ap...@yahoo.com> wrote:
> >> > > I think a 2 node cluster is simply too small for
> >> > > the full load of everything.
> >> > >
> >
> >
> >
> >
> >
> 
> 
> 
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> edwardyoon@apache.org
> http://blog.udanax.org


Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
> Let us know how else we can help along your project.

Yup, Thanks. :)

On Tue, Dec 2, 2008 at 1:50 PM, Michael Stack <st...@duboce.net> wrote:
> There is none in hbase; it doesn't manage the filesystem so doesn't make the
> best sense adding it there (We could add it as a metric I suppose).  In hdfs
> there are facilities for asking that it only fill a percentage or an
> explicit amount of the allocated space -- see hadoop-default.xml.  I'm not
> sure how well these work.
>
> Would suggest that you consider the advice given by the lads -- jgray on
> how-to cluster monitor (including disk usage) and apurtell on not-enough
> resources -- if you want to get serious about your cluster.
>
> Let us know how else we can help along your project.
>
> St.Ack
>
>
>
> Edward J. Yoon wrote:
>>
>> I'm considering to store the large-scale web-mail data on the Hbase.
>> As you know, there is a lot of mail bomb (e.g. spam, group mail,...,
>> etc). So, I tested these.
>>
>> Here's my additionally question. Have we a monitoring tool for disk space?
>>
>> /Edward
>>
>> On Tue, Dec 2, 2008 at 11:42 AM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>
>>>
>>> Edward,
>>>
>>> You are running with insufficient resources -- too little CPU
>>> for your task and too little disk for your data.
>>>
>>> If you are running a mapreduce task and DFS runs out of space
>>> for the temporary files, then you indeed should expect
>>> aberrant job status from the Hadoop job framework, for
>>> example such things as completion status running backwards.
>>>
>>> I do agree that under these circumstances HBase daemons
>>> should fail more gracefully, by entering some kind of
>>> degraded read only mode, if DFS is not totally dead. I
>>> suspect this is already on a to do list somewhere, and I
>>> vaguely recall a jira filed on that topic.
>>>
>>>  - Andy
>>>
>>>
>>>
>>>>
>>>> From: Edward J. Yoon <ed...@apache.org>
>>>> Subject: Re: Bulk import question.
>>>> To: hbase-user@hadoop.apache.org, apurtell@apache.org
>>>> Date: Monday, December 1, 2008, 6:26 PM
>>>> It was by 'Datanode DiskOutOfSpaceException'. But, I
>>>> think daemons should not dead.
>>>>
>>>> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
>>>> <ed...@apache.org> wrote:
>>>>
>>>>>
>>>>> Hmm. It often occurs to me. I'll check the logs.
>>>>>
>>>>> On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
>>>>>
>>>>
>>>> <ap...@yahoo.com> wrote:
>>>>
>>>>>>
>>>>>> I think a 2 node cluster is simply too small for
>>>>>> the full load of everything.
>>>>>>
>>>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by Michael Stack <st...@duboce.net>.
There is none in hbase; it doesn't manage the filesystem, so it doesn't
make much sense to add one there (we could add it as a metric, I suppose).
In hdfs there are facilities for asking that it only fill a percentage
or an explicit amount of the allocated space -- see hadoop-default.xml.
I'm not sure how well these work.

Would suggest that you consider the advice given by the lads -- jgray on 
how-to cluster monitor (including disk usage) and apurtell on not-enough 
resources -- if you want to get serious about your cluster.

Let us know how else we can help along your project.

St.Ack



Edward J. Yoon wrote:
> I'm considering to store the large-scale web-mail data on the Hbase.
> As you know, there is a lot of mail bomb (e.g. spam, group mail,...,
> etc). So, I tested these.
>
> Here's my additionally question. Have we a monitoring tool for disk space?
>
> /Edward
>
> On Tue, Dec 2, 2008 at 11:42 AM, Andrew Purtell <ap...@apache.org> wrote:
>   
>> Edward,
>>
>> You are running with insufficient resources -- too little CPU
>> for your task and too little disk for your data.
>>
>> If you are running a mapreduce task and DFS runs out of space
>> for the temporary files, then you indeed should expect
>> aberrant job status from the Hadoop job framework, for
>> example such things as completion status running backwards.
>>
>> I do agree that under these circumstances HBase daemons
>> should fail more gracefully, by entering some kind of
>> degraded read only mode, if DFS is not totally dead. I
>> suspect this is already on a to do list somewhere, and I
>> vaguely recall a jira filed on that topic.
>>
>>   - Andy
>>
>>
>>     
>>> From: Edward J. Yoon <ed...@apache.org>
>>> Subject: Re: Bulk import question.
>>> To: hbase-user@hadoop.apache.org, apurtell@apache.org
>>> Date: Monday, December 1, 2008, 6:26 PM
>>> It was by 'Datanode DiskOutOfSpaceException'. But, I
>>> think daemons should not dead.
>>>
>>> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
>>> <ed...@apache.org> wrote:
>>>       
>>>> Hmm. It often occurs to me. I'll check the logs.
>>>>
>>>> On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
>>>>         
>>> <ap...@yahoo.com> wrote:
>>>       
>>>>> I think a 2 node cluster is simply too small for
>>>>> the full load of everything.
>>>>>
>>>>>           
>>
>>
>>
>>     
>
>
>
>   


Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
I'm considering storing large-scale web-mail data in HBase.
As you know, there are a lot of mail bombs (e.g. spam, group mail,
etc.), so I tested those cases.

Here's an additional question: do we have a monitoring tool for disk space?

/Edward

On Tue, Dec 2, 2008 at 11:42 AM, Andrew Purtell <ap...@apache.org> wrote:
> Edward,
>
> You are running with insufficient resources -- too little CPU
> for your task and too little disk for your data.
>
> If you are running a mapreduce task and DFS runs out of space
> for the temporary files, then you indeed should expect
> aberrant job status from the Hadoop job framework, for
> example such things as completion status running backwards.
>
> I do agree that under these circumstances HBase daemons
> should fail more gracefully, by entering some kind of
> degraded read only mode, if DFS is not totally dead. I
> suspect this is already on a to do list somewhere, and I
> vaguely recall a jira filed on that topic.
>
>   - Andy
>
>
>> From: Edward J. Yoon <ed...@apache.org>
>> Subject: Re: Bulk import question.
>> To: hbase-user@hadoop.apache.org, apurtell@apache.org
>> Date: Monday, December 1, 2008, 6:26 PM
>> It was by 'Datanode DiskOutOfSpaceException'. But, I
>> think daemons should not dead.
>>
>> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
>> <ed...@apache.org> wrote:
>> > Hmm. It often occurs to me. I'll check the logs.
>> >
>> > On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
>> <ap...@yahoo.com> wrote:
>> > > I think a 2 node cluster is simply too small for
>> > > the full load of everything.
>> > >
>
>
>
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by Andrew Purtell <ap...@apache.org>.
Edward,

You are running with insufficient resources -- too little CPU
for your task and too little disk for your data. 

If you are running a mapreduce task and DFS runs out of space
for the temporary files, then you indeed should expect
aberrant job status from the Hadoop job framework, for
example such things as completion status running backwards.
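
To make the "running backwards" effect concrete, here is a toy model in
Java. It is a deliberate simplification and not Hadoop's actual
accounting: it treats the reported map percentage as the mean progress
of all map tasks, and treats lost map outputs as tasks reset to zero.
The class name MapProgressDemo and the numbers (100 tasks, 62 finished,
32 lost) are assumptions chosen only to mirror the 62% -> 30% drop in
the log.

```java
public class MapProgressDemo {

    /** Reported percentage: mean of per-task progress values in [0, 1]. */
    static int percentDone(double[] taskProgress) {
        double sum = 0.0;
        for (double p : taskProgress) {
            sum += p;
        }
        return (int) Math.round(100.0 * sum / taskProgress.length);
    }

    /** Resets tasks in [from, to) to zero, as if they must be re-run. */
    static void loseTasks(double[] taskProgress, int from, int to) {
        for (int i = from; i < to; i++) {
            taskProgress[i] = 0.0;
        }
    }

    public static void main(String[] args) {
        double[] tasks = new double[100];
        for (int i = 0; i < 62; i++) {
            tasks[i] = 1.0; // 62 of 100 map tasks have finished
        }
        System.out.println("before: " + percentDone(tasks) + "%"); // before: 62%

        // A node fails and the outputs of 32 finished maps are lost.
        loseTasks(tasks, 0, 32);
        System.out.println("after:  " + percentDone(tasks) + "%"); // after:  30%
    }
}
```

On a two-node cluster, losing one node could plausibly discard about
half of the completed maps, which is roughly the size of the drop seen
here.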

I do agree that under these circumstances HBase daemons
should fail more gracefully, by entering some kind of
degraded read only mode, if DFS is not totally dead. I 
suspect this is already on a to do list somewhere, and I
vaguely recall a jira filed on that topic.

   - Andy


> From: Edward J. Yoon <ed...@apache.org>
> Subject: Re: Bulk import question.
> To: hbase-user@hadoop.apache.org, apurtell@apache.org
> Date: Monday, December 1, 2008, 6:26 PM
> It was by 'Datanode DiskOutOfSpaceException'. But, I
> think daemons should not dead.
> 
> On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon
> <ed...@apache.org> wrote:
> > Hmm. It often occurs to me. I'll check the logs.
> >
> > On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell
> <ap...@yahoo.com> wrote:
> > > I think a 2 node cluster is simply too small for
> > > the full load of everything.
> > >



      

Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
It was caused by a datanode 'DiskOutOfSpaceException'. Still, I think
the daemons should not die.

On Wed, Nov 26, 2008 at 1:08 PM, Edward J. Yoon <ed...@apache.org> wrote:
> Hmm. It often occurs to me. I'll check the logs.
>
> On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell <ap...@yahoo.com> wrote:
>> I think a 2 node cluster is simply too small for the full
>> load of everything.
>>
>> When I go that small I leave DFS out of the picture and run
>> HBase (in "local" mode) on top of a local file system on one
>> node and the jobtracker and tasktrackers on the other.
>> Even then I upped RAM on the HBase node to 3GB and run HBase
>> with 2GB heap for satisfactory results.
>>
>>   - Andy
>>
>>
>>> From: stack <st...@duboce.net>
>>> Subject: Re: Bulk import question.
>>> To: hbase-user@hadoop.apache.org
>>> Date: Thursday, November 20, 2008, 9:40 AM
>>> Edward J. Yoon wrote:
>>> > When I tried to bulk import, I received below error.
>>> (code is same
>>> > with hbase wiki)
>>> >
>>> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62%
>>> reduce 0%
>>> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30%
>>> reduce 0%
>>> >
>>> > Is it possible? And, hadoop/hbase daemons are crashed.
>>> >
>>> Percentage done can go in reverse if framework loses a
>>> bunch of maps (e.g. if crash).
>>>
>>> > - hadoop-0.18.2 & hbase-0.18.1
>>> > - 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
>>> > - 2 node cluster
>>> >
>>>
>>> So, on each node you have datanode, tasktracker, and
>>> regionserver running and then on one of the nodes you also
>>> have namenode plus jobtracker?  How many tasks per server?
>>> Two, the default?
>>>
>>> Check out your regionserver logs.  My guess is one likely
>>> crashed, perhaps because it was starved of time or because
>>> its datanode was not responding nicely because it was
>>> loaded.
>>>
>>> You've enabled DEBUG in hbase so you can get detail,
>>> upped your file descriptors and your xceiverCount count?
>>> (See FAQ for how).
>>>
>>> St.Ack
>>>
>>>
>>> >
>>> > ----
>>> > 08/11/20 15:23:36 INFO mapred.JobClient:  map 57%
>>> reduce 0%
>>> > 08/11/20 15:23:40 INFO mapred.JobClient:  map 59%
>>> reduce 0%
>>> > 08/11/20 15:23:45 INFO mapred.JobClient:  map 60%
>>> reduce 0%
>>> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62%
>>> reduce 0%
>>> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30%
>>> reduce 0%
>>> > 08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
>>> > attempt_200811131622_0019_m_000000_0, Status : FAILED
>>> >
>>> org.apache.hadoop.hbase.client.RetriesExhaustedException:
>>> Trying to
>>> > contact region server 61.247.201.164:60020 for region
>>> > mail,,1227162121175, row '?:', but failed
>>> after 10 attempts.
>>> > Exceptions:
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> > java.io.IOException: Call failed on local exception
>>> >
>>> >         at
>>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
>>> >         at
>>> org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
>>> >         at
>>> org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
>>> >         at
>>> com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
>>> >         at
>>> com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
>>> >         at
>>> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>>> >         at
>>> org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>>> >         at
>>> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>> >
>>> >
>>
>>
>>
>>
>
>
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> edwardyoon@apache.org
> http://blog.udanax.org
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by "Edward J. Yoon" <ed...@apache.org>.
Hmm. This happens to me often. I'll check the logs.

On Fri, Nov 21, 2008 at 9:46 AM, Andrew Purtell <ap...@yahoo.com> wrote:
> I think a 2 node cluster is simply too small for the full
> load of everything.
>
> When I go that small I leave DFS out of the picture and run
> HBase (in "local" mode) on top of a local file system on one
> node and the jobtracker and tasktrackers on the other.
> Even then I upped RAM on the HBase node to 3GB and run HBase
> with 2GB heap for satisfactory results.
>
>   - Andy
>
>
>> From: stack <st...@duboce.net>
>> Subject: Re: Bulk import question.
>> To: hbase-user@hadoop.apache.org
>> Date: Thursday, November 20, 2008, 9:40 AM
>> Edward J. Yoon wrote:
>> > When I tried to bulk import, I received below error.
>> (code is same
>> > with hbase wiki)
>> >
>> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62%
>> reduce 0%
>> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30%
>> reduce 0%
>> >
>> > Is it possible? And, hadoop/hbase daemons are crashed.
>> >
>> Percentage done can go in reverse if framework loses a
>> bunch of maps (e.g. if crash).
>>
>> > - hadoop-0.18.2 & hbase-0.18.1
>> > - 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
>> > - 2 node cluster
>> >
>>
>> So, on each node you have datanode, tasktracker, and
>> regionserver running and then on one of the nodes you also
>> have namenode plus jobtracker?  How many tasks per server?
>> Two, the default?
>>
>> Check out your regionserver logs.  My guess is one likely
>> crashed, perhaps because it was starved of time or because
>> its datanode was not responding nicely because it was
>> loaded.
>>
>> You've enabled DEBUG in hbase so you can get detail,
>> upped your file descriptors and your xceiverCount count?
>> (See FAQ for how).
>>
>> St.Ack
>>
>>
>> >
>> > ----
>> > 08/11/20 15:23:36 INFO mapred.JobClient:  map 57%
>> reduce 0%
>> > 08/11/20 15:23:40 INFO mapred.JobClient:  map 59%
>> reduce 0%
>> > 08/11/20 15:23:45 INFO mapred.JobClient:  map 60%
>> reduce 0%
>> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62%
>> reduce 0%
>> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30%
>> reduce 0%
>> > 08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
>> > attempt_200811131622_0019_m_000000_0, Status : FAILED
>> >
>> org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> Trying to
>> > contact region server 61.247.201.164:60020 for region
>> > mail,,1227162121175, row '?:', but failed
>> after 10 attempts.
>> > Exceptions:
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> > java.io.IOException: Call failed on local exception
>> >
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
>> >         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
>> >         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
>> >         at com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
>> >         at com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
>> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>> >         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>> >
>> >
>
>
>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Bulk import question.

Posted by Andrew Purtell <ap...@yahoo.com>.
I think a 2 node cluster is simply too small for the full
load of everything.

When I go that small I leave DFS out of the picture and run
HBase (in "local" mode) on top of a local file system on one
node and the jobtracker and tasktrackers on the other.
Even then I had to up the RAM on the HBase node to 3GB and run
HBase with a 2GB heap to get satisfactory results.

   - Andy


> From: stack <st...@duboce.net>
> Subject: Re: Bulk import question.
> To: hbase-user@hadoop.apache.org
> Date: Thursday, November 20, 2008, 9:40 AM
> Edward J. Yoon wrote:
> > When I tried to bulk import, I received below error. (code is same
> > with hbase wiki)
> > 
> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
> > 
> > Is it possible? And, hadoop/hbase daemons are crashed.
> >   
> Percentage done can go in reverse if framework loses a
> bunch of maps (e.g. if crash).
> 
> > - hadoop-0.18.2 & hbase-0.18.1
> > - 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
> > - 2 node cluster
> >   
> 
> So, on each node you have datanode, tasktracker, and
> regionserver running and then on one of the nodes you also
> have namenode plus jobtracker?  How many tasks per server? 
> Two, the default?
> 
> Check out your regionserver logs.  My guess is one likely
> crashed, perhaps because it was starved of time or because
> its datanode was not responding nicely because it was
> loaded.
> 
> You've enabled DEBUG in hbase so you can get detail,
> upped your file descriptors and your xceiverCount count?
> (See FAQ for how).
> 
> St.Ack
> 
> 
> > 
> > ----
> > 08/11/20 15:23:36 INFO mapred.JobClient:  map 57% reduce 0%
> > 08/11/20 15:23:40 INFO mapred.JobClient:  map 59% reduce 0%
> > 08/11/20 15:23:45 INFO mapred.JobClient:  map 60% reduce 0%
> > 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> > 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
> > 08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
> > attempt_200811131622_0019_m_000000_0, Status : FAILED
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> > contact region server 61.247.201.164:60020 for region
> > mail,,1227162121175, row '?:', but failed after 10 attempts.
> > Exceptions:
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > java.io.IOException: Call failed on local exception
> > 
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
> >         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
> >         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
> >         at com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
> >         at com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> >         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> > 
> >


      

Re: Bulk import question.

Posted by stack <st...@duboce.net>.
Edward J. Yoon wrote:
> When I tried to bulk import, I received below error. (code is same
> with hbase wiki)
>
> 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
>
> Is it possible? And, hadoop/hbase daemons are crashed.
>   
Percentage done can go in reverse if the framework loses a bunch of maps 
(e.g. after a crash).
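
To make the arithmetic concrete, here is a minimal sketch with made-up numbers. It assumes map progress is simply completed maps over total maps, which is a simplification of what the framework reports, and the class name, job size, and lost-map count are all hypothetical. It only shows how a 62% reading can become 30% once already-finished maps have to be rerun.

```java
// Simplified model: progress = completed maps / total maps.
// All names and numbers here are hypothetical, chosen to mirror the log above.
final class MapProgress {

    // Integer percentage of map work done.
    static int percentDone(int completedMaps, int totalMaps) {
        return (int) (100L * completedMaps / totalMaps);
    }

    public static void main(String[] args) {
        int totalMaps = 100;    // hypothetical job size
        int completed = 62;     // maps finished before the crash
        int lostMaps = 32;      // finished maps whose output was lost in the crash

        System.out.println("before: " + percentDone(completed, totalMaps) + "%");

        // Lost maps no longer count as done; they are rescheduled and rerun.
        completed -= lostMaps;
        System.out.println("after:  " + percentDone(completed, totalMaps) + "%");
    }
}
```

Running it prints 62% before the loss and 30% after, matching the jump in the JobClient output.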

> - hadoop-0.18.2 & hbase-0.18.1
> - 4 CPU, SATA hard disk, Physical Memory 16,626,844 KB
> - 2 node cluster
>   

So, on each node you have datanode, tasktracker, and regionserver 
running and then on one of the nodes you also have namenode plus 
jobtracker?  How many tasks per server?  Two, the default?

Check out your regionserver logs.  My guess is one likely crashed, 
perhaps because it was starved of time or because its datanode was not 
responding nicely because it was loaded.

Have you enabled DEBUG in hbase so you can get detail, and upped your file 
descriptors and your xceiverCount? (See FAQ for how).
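
A rough sketch of the settings these questions point at, assuming the stock hadoop-0.18 and hbase-0.18 conf layout. The property names are believed correct for that line of releases, but the values are illustrative only, so check the FAQ before copying anything.

```text
# hbase conf/log4j.properties: DEBUG logging for HBase classes
log4j.logger.org.apache.hadoop.hbase=DEBUG

# hadoop conf/hadoop-site.xml: datanode xceiver ceiling
# (the property name really is spelled "xcievers"; 2047 is illustrative)
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2047</value>
</property>

# hadoop conf/hadoop-site.xml: concurrent map tasks per tasktracker
# (2 is the default being asked about above)
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>

# shell, as the user that runs the daemons: show the open-file limit
ulimit -n
```

The datanodes and regionservers need a restart to pick up changes to these files.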

St.Ack


>
> ----
> 08/11/20 15:23:36 INFO mapred.JobClient:  map 57% reduce 0%
> 08/11/20 15:23:40 INFO mapred.JobClient:  map 59% reduce 0%
> 08/11/20 15:23:45 INFO mapred.JobClient:  map 60% reduce 0%
> 08/11/20 15:23:50 INFO mapred.JobClient:  map 62% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient:  map 30% reduce 0%
> 08/11/20 15:27:10 INFO mapred.JobClient: Task Id :
> attempt_200811131622_0019_m_000000_0, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 61.247.201.164:60020 for region
> mail,,1227162121175, row '?:', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
> java.io.IOException: Call failed on local exception
>
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:964)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:950)
>         at com.nhn.mail.Runner$InnerMap.map(Runner.java:59)
>         at com.nhn.mail.Runner$InnerMap.map(Runner.java:38)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>
>