Posted to user@hbase.apache.org by "Luis A. Sarmento" <lu...@telecom.pt> on 2011/12/10 18:51:52 UTC

Strange problem while scanning HBase table with PIG

Hi,

I'm scanning a relatively large table stored in HBase using pig.
I've got a column family named event_data with 3 columns (tab_event_string, date and Id).
The table is indexed by a key made up of an event code and a timestamp.
Nothing special about this table except for the fact that it is relatively large.

Below is the PIG code for scanning the table (parameters $GTE and $LTE are basically the begin and end timestamps).

RAW_1 = LOAD 'my_events'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'event_data:tab_event_string event_data:date event_data:Id',
        '-caching 100 -limit 2000 -gte 000100_$GTE -lte 000100_$LTE -caster HBaseBinaryConverter'
)
AS (tab_event_string:bytearray, date:bytearray, Id:bytearray);

Now, the problem is that one of the mappers scanning this table always takes too long to initialize. I always get two messages (for attempts 0 and 1):

Task attempt_201107151702_48728_m_000000_0 failed to report status for 600 seconds. Killing!
Task attempt_201107151702_48728_m_000000_1 failed to report status for 600 seconds. Killing!
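
(As an aside: the 600-second limit looks like Hadoop's mapred.task.timeout, so raising it from the Pig script would postpone the kill while debugging. Just a sketch, assuming that property is what is in effect here:

SET mapred.task.timeout 1800000;  -- 30 minutes instead of the default 600000 ms

But that obviously doesn't explain why only this one mapper is slow.)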


And once it initializes - e.g. at attempt 2 - I always end up with a scanner exception:

org.apache.hadoop.hbase.client.ScannerTimeoutException: 61056ms passed since the last invocation, timeout is currently set to 60000
	at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1128)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:143)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
	at org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat$HBaseTableRecordReader.nextKeyValue(HBaseTableInputFormat.java:162)
	at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getNext(HBaseStorage.java:319)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 5364262096576298375
	at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1794)
	at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1012)
	at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1119)
	... 11 more

Other task attempts usually also fail:

Task attempt_201107151702_48728_m_000000_3 failed to report status for 600 seconds. Killing!

...

Now, this *only happens for one mapper* --> mapper_0
No matter how I change my scanning parameters - different begin/end timestamps, more data vs. less data, different caching, etc. - it always ends up like this: mapper 0 fails, even when the other 500 mappers succeed.

I really don't have a clue why this is happening. The most intriguing part is that it always happens for mapper 0, no matter which machine in the cluster it runs on.

Does anyone have a clue about this?

Thanks!

Luis

Re: Strange problem while scanning HBase table with PIG

Posted by Stack <st...@duboce.net>.
On Sat, Dec 10, 2011 at 9:51 AM, Luis A. Sarmento
<lu...@telecom.pt> wrote:
> Now, this *only happens for one mapper* --> mapper_0
> No matter how I change my scanning parameters - different begin/end timestamps, more data vs. less data, different caching, etc. - it always ends up like this: mapper 0 fails, even when the other 500 mappers succeed.
>

Could this mapper have a massive record in it, or be so full of
deletes that it takes > 60 seconds to traverse them?

You have a lease expiry going on.  This happens because either the
client is taking > 60 seconds before it goes back to the server (so,
server-side, the scan lease has expired), or we're down in the depths
of the server traversing data looking for something to return and
that is taking longer than 60 seconds.

Scan the mapper_0 range and see what comes back.
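
Something like this from Pig would do it (a rough sketch;
MAPPER0_START_KEY and MAPPER0_END_KEY are placeholders for the
boundary keys of the region mapper_0 is reading, which you can get
from the table's region list):

-- placeholders: substitute mapper_0's actual region start/stop keys
PROBE = LOAD 'my_events'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'event_data:tab_event_string',
        '-caching 10 -gte MAPPER0_START_KEY -lte MAPPER0_END_KEY -caster HBaseBinaryConverter'
)
AS (tab_event_string:bytearray);
G = GROUP PROBE ALL;
CNT = FOREACH G GENERATE COUNT(PROBE);
DUMP CNT;

If even this small-caching probe crawls or times out, the problem is
in that key range (fat rows or piles of deletes) rather than in your
script.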

You could change the client cache size too.  Perhaps you are getting
lots of rows on each invocation, so many rows that it's taking you a
while to process them all?  This section of the book has some notes
that might be of use: http://hbase.apache.org/book.html#perf.reading
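
E.g., the same LOAD with -caching dropped from 100 to 10 (a sketch
only), so each batch the client holds between trips back to the
region server is small enough to process well inside the 60-second
lease:

RAW_1 = LOAD 'my_events'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'event_data:tab_event_string event_data:date event_data:Id',
        '-caching 10 -limit 2000 -gte 000100_$GTE -lte 000100_$LTE -caster HBaseBinaryConverter'
)
AS (tab_event_string:bytearray, date:bytearray, Id:bytearray);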

Add logging?  See if it's a server-side or a client-side problem?

St.Ack