Posted to user@hbase.apache.org by Ferdy Galema <fe...@kalooga.com> on 2012/07/16 14:35:01 UTC

simple inputformat to ignore lease and timeout exceptions

Some mapred jobs running scans on our HBase could not succeed because of
the dreaded LeaseException or ScannerTimeoutException, even with
hbase.client.scanner.caching set to 1 and long timeout properties. Mind you
that no row is ever bigger than 5MB (sure, that's bigger than most use cases,
but it's still not excessive), and most are a lot smaller. A simple
solution that SKIPS a region once a timeout/lease exception occurs
worked perfectly fine for us. (It hits just a few regions of our several
thousand.) This is a LOT better than the mapreduce framework retrying the
same mapper (region) several times and then failing the entire
job. Because in our case the actual work is done in the reducer, the job
failing is a big pain.

I just wanted to share this solution with you. This is for 0.90.x. Also, I'm
aware of HBASE-5757, so this is for anyone who cannot use that fix yet (or
for whom it does not work because retrying is not an option). So if you can
afford some regions to be skipped (or actually "not fully processed"), use
the following inputformat. (Call
job.setInputFormatClass(IgnoreTimeoutsTableInputFormat.class) right after
the TableMapReduceUtil.initTableMapperJob(...) call.)

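// Package locations as of 0.90.x; adjust if your version differs.
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ScannerTimeoutException;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableRecordReader;
import org.apache.hadoop.hbase.regionserver.LeaseException;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class IgnoreTimeoutsTableInputFormat extends TableInputFormat {
  private static final Log LOG =
      LogFactory.getLog(IgnoreTimeoutsTableInputFormat.class);

  @Override
  public RecordReader<ImmutableBytesWritable, Result> createRecordReader(
      InputSplit split, TaskAttemptContext context) throws IOException {
    TableRecordReader tableRecordReader = new TableRecordReader() {
      @Override
      public boolean nextKeyValue() throws IOException, InterruptedException {
        try {
          return super.nextKeyValue();
        } catch (LeaseException e) {
          // Lease expired: log it and fall through to end this split early.
          LOG.warn("fallthrough lease exc", e);
        } catch (ScannerTimeoutException e) {
          // Scanner timed out: same treatment.
          LOG.warn("fallthrough scanner timeout exc", e);
        }
        // Report "no more rows" so the mapper finishes instead of failing.
        return false;
      }
    };
    // The superclass uses the reader set here when creating the record reader.
    setTableRecordReader(tableRecordReader);
    return super.createRecordReader(split, context);
  }
}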
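
To make the "not fully processed" part concrete, here is a small
stand-alone sketch of the same pattern without any HBase classes. All names
in it (SkipOnTimeoutSketch, RowReader, FailingReader, IgnoreTimeoutsReader,
FakeScannerTimeoutException) are hypothetical stand-ins made up for
illustration, not HBase API. It only shows that rows read before the
exception are still delivered, and that the remainder of the region is
silently dropped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SkipOnTimeoutSketch {

  /** Hypothetical stand-in for LeaseException / ScannerTimeoutException. */
  static class FakeScannerTimeoutException extends IOException {
    FakeScannerTimeoutException(String msg) { super(msg); }
  }

  /** Minimal reader contract, loosely modelled on nextKeyValue(). */
  interface RowReader {
    boolean nextKeyValue() throws IOException;
    String getCurrentValue();
  }

  /** Simulated region scan that throws once failAfter rows were served. */
  static class FailingReader implements RowReader {
    private final Iterator<String> rows;
    private final int failAfter; // negative means "never fail"
    private int served = 0;
    private String current;

    FailingReader(List<String> rows, int failAfter) {
      this.rows = rows.iterator();
      this.failAfter = failAfter;
    }

    public boolean nextKeyValue() throws IOException {
      if (failAfter >= 0 && served == failAfter) {
        throw new FakeScannerTimeoutException("simulated timeout");
      }
      if (!rows.hasNext()) {
        return false;
      }
      current = rows.next();
      served++;
      return true;
    }

    public String getCurrentValue() { return current; }
  }

  /** Wrapper that reports end-of-input instead of propagating the timeout. */
  static class IgnoreTimeoutsReader implements RowReader {
    private final RowReader delegate;

    IgnoreTimeoutsReader(RowReader delegate) { this.delegate = delegate; }

    public boolean nextKeyValue() throws IOException {
      try {
        return delegate.nextKeyValue();
      } catch (FakeScannerTimeoutException e) {
        System.err.println("skipping rest of region: " + e.getMessage());
        return false;
      }
    }

    public String getCurrentValue() { return delegate.getCurrentValue(); }
  }

  /** Drains a reader the way a mapper loop would, collecting the rows. */
  static List<String> readAll(RowReader reader) {
    List<String> out = new ArrayList<>();
    try {
      while (reader.nextKeyValue()) {
        out.add(reader.getCurrentValue());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> region = Arrays.asList("row1", "row2", "row3", "row4");
    // The simulated scan fails after two rows; the wrapper ends the split there.
    System.out.println(readAll(new IgnoreTimeoutsReader(new FailingReader(region, 2))));
  }
}
```

Without the wrapper, the same simulated scan would throw and the task would
fail, which in the real job is what triggers the retries.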

Nevertheless, we are still trying to understand why some regions always
fail...