Posted to user@hbase.apache.org by schubert zhang <zs...@gmail.com> on 2009/02/28 11:56:23 UTC

strange region name, is it right?

I have been using HBase and Hadoop for 5 months.

My testbed has 5 nodes (1 master and 4 slaves):
Hadoop-0.19.1
HBase-0.19.0

1. I used the TeraGen MapReduce job from the Hadoop examples to generate files
with random key-value pairs.
    I created a 1GB data set and another 10GB data set for later tests.

2. Then I wrote a job to read these TeraGen files and insert each record's
key-value pair into an HBase table.
    create 'sort1g', {NAME => 't', VERSIONS => 1}
    create 'sort10g', {NAME => 't', VERSIONS => 1}
    I want to use these insert jobs to simulate TeraSort, since HBase
automatically sorts rows.

3. After the insert jobs finished, I found the following strange thing on the
HBase web interface:

Name  Region Server  Encoded Name  Start Key  End Key
......
sort10g,%ql`{^8Bcf,1235730412828  nd2-rack0-cloud:60020  155375382  %ql`{^8Bcf  &YK&Uop0a=
sort10g,&YK&Uop0a=,1235730749832  nd1-rack0-cloud:60020  1574155935  &YK&Uop0a=  'B'Zp+!]Tb
sort10g,'B'Zp+!]Tb,1235730749832  nd1-rack0-cloud:60020  395792177  'B'Zp+!]Tb  ()o:
sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
sort10g,(qYp"7;j2$,1235730731006  nd1-rack0-cloud:60020  2143364419  (qYp"7;j2$  )Z/?>:ZM3Z
sort10g,)Z/?>:ZM3Z,1235730853698  nd2-rack0-cloud:60020  440987412  )Z/?>:ZM3Z  *BuVHF#1ME
.......
sort10g,:Qt-(8;Y>i,1235730441379  nd1-rack0-cloud:60020  1461025497  :Qt-(8;Y>i  ;;Vg!IT[d"
sort10g,;;Vg!IT[d",1235730461102  nd1-rack0-cloud:60020  36776992  ;;Vg!IT[d"  <$#
sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
sort10g,=VyK?xTtI`,1235730334262  nd3-rack0-cloud:60020  1165072084  =VyK?xTtI`  >A274Dj=vU
.......
sort10g,s#Y}pGP|{3,1235730476424  nd1-rack0-cloud:60020  1728348677  s#Y}pGP|{3  soWA+0=0Ao
sort10g,soWA+0=0Ao,1235730487163  nd1-rack0-cloud:60020  1275380223  soWA+0=0Ao  t\<
sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
sort10g,uI-1OW2g=t,1235730515195  nd1-rack0-cloud:60020  232566103  uI-1OW2g=t  v6'-_5E]7'


Some of the lines above do not look normal:
sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t


Could you please tell me whether this is right or not?
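
For comparison, most region names in the listing follow the pattern
table,startKey,regionId (for example sort10g,&YK&Uop0a=,1235730749832), while
the odd ones stop before the id. A minimal sketch in plain Java (not HBase
code; RegionNameCheck and looksComplete are names invented for illustration)
that flags names lacking a trailing numeric id, assuming that pattern holds:

```java
final class RegionNameCheck {
    /**
     * Returns true if the name looks like table,startKey,regionId with a
     * numeric id after the last comma. The start key may itself be empty
     * or contain commas, so only the first and last comma are inspected.
     */
    static boolean looksComplete(String regionName) {
        int first = regionName.indexOf(',');
        int last = regionName.lastIndexOf(',');
        if (first < 0 || last <= first) {
            return false; // fewer than two commas: no id field at all
        }
        String id = regionName.substring(last + 1);
        if (id.isEmpty()) {
            return false;
        }
        for (char c : id.toCharArray()) {
            if (!Character.isDigit(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = {
            "sort10g,&YK&Uop0a=,1235730749832",
            "sort10g,()o:",
            "sort10g,<$#",
            "sort10g,",
            "sort10g,t\\<"
        };
        for (String name : names) {
            System.out.println(name + " -> " + looksComplete(name));
        }
    }
}
```

This only checks the displayed text; it says nothing about whether the regions
themselves are healthy or whether the web page merely rendered the names badly.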

Re: strange region name, is it right?

Posted by schubert zhang <zs...@gmail.com>.
Now the RowCounter works for both cases.
For the endRow I should use:
    new WhileMatchRowFilter(new StopRowFilter(endRow))

Previously I used new StopRowFilter(m_endRow) directly.

However, I cannot explain why the incorrectly used StopRowFilter still worked
for the 1GB table t1.
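
A minimal sketch of the difference between the two usages, in plain Java with
hypothetical stand-in classes (RowFilter, StopRow, WhileMatch) rather than the
real HBase classes. It assumes that a bare stop-row filter only rejects rows at
or past the stop key, while the while-match wrapper additionally reports that
the scan may end once the inner filter has rejected a row:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StopRowSketch {
    /** Simplified stand-in for a row filter; not the HBase interface. */
    interface RowFilter {
        boolean filterRow(String rowKey);   // true = skip this row
        boolean filterAllRemaining();       // true = the scan may stop
    }

    /** Rejects every row whose key is >= stopRow, but never ends the scan. */
    static final class StopRow implements RowFilter {
        private final String stopRow;
        StopRow(String stopRow) { this.stopRow = stopRow; }
        public boolean filterRow(String rowKey) { return rowKey.compareTo(stopRow) >= 0; }
        public boolean filterAllRemaining() { return false; }
    }

    /** Ends the scan as soon as the wrapped filter rejects a row. */
    static final class WhileMatch implements RowFilter {
        private final RowFilter inner;
        private boolean done = false;
        WhileMatch(RowFilter inner) { this.inner = inner; }
        public boolean filterRow(String rowKey) {
            if (inner.filterRow(rowKey)) {
                done = true;
            }
            return done;
        }
        public boolean filterAllRemaining() { return done || inner.filterAllRemaining(); }
    }

    /** Scans sorted rows, collects accepted ones, returns how many rows were examined. */
    static int scan(List<String> sortedRows, RowFilter filter, List<String> accepted) {
        int examined = 0;
        for (String row : sortedRows) {
            if (filter.filterAllRemaining()) {
                break;
            }
            examined++;
            if (!filter.filterRow(row)) {
                accepted.add(row);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d", "e");
        List<String> plain = new ArrayList<String>();
        List<String> wrapped = new ArrayList<String>();
        int plainExamined = scan(rows, new StopRow("c"), plain);
        int wrappedExamined = scan(rows, new WhileMatch(new StopRow("c")), wrapped);
        System.out.println("StopRow alone: " + plain + ", examined " + plainExamined);
        System.out.println("WhileMatch(StopRow): " + wrapped + ", examined " + wrappedExamined);
    }
}
```

Under these assumptions both variants return the same rows, but the unwrapped
filter keeps examining everything past the stop key. That would cost little on
a small table and much more on a large one; whether it fully explains the
behaviour seen here is not confirmed.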

Schubert

On Sun, Mar 1, 2009 at 5:19 PM, schubert zhang <zs...@gmail.com> wrote:

> I have done two exercises with TeraDataGen and TeraDataSort:
> (1)  1GB data -> table t1
> (2)  2GB data -> table t2
>
> Then I wrote a mapred job to do the RowCounter: each mapper counts the rows
> of one region, followed by a combiner and then a reducer.
>
> The RowCounter job for table t1 (1GB) works fine and finished in 3
> minutes.
> But the RowCounter job for table t2 (10GB) cannot complete. I checked
> each map task's status and found it stuck in the spill step: each
> map has completed only 2 spills, but there are 3 spills for each map task.
>
> I think the map task (child) is deadlocked between spilling the map output
> (SpillThread) and openScanner....
>
> Schubert

Re: strange region name, is it right?

Posted by schubert zhang <zs...@gmail.com>.
I have done two exercises with TeraDataGen and TeraDataSort:
(1)  1GB data -> table t1
(2)  2GB data -> table t2

Then I wrote a mapred job to do the RowCounter: each mapper counts the rows
of one region, followed by a combiner and then a reducer.

The RowCounter job for table t1 (1GB) works fine and finished in 3
minutes.
But the RowCounter job for table t2 (10GB) cannot complete. I checked
each map task's status and found it stuck in the spill step: each
map has completed only 2 spills, but there are 3 spills for each map task.

I think the map task (child) is deadlocked between spilling the map output
(SpillThread) and openScanner....
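
Whether a hang is a true Java-level deadlock (a cycle of threads each holding a
lock another one needs) can be checked directly: jstack reports such cycles
when it finds them, and the same check is available in-process through the
standard ThreadMXBean. A minimal sketch, assuming it runs inside the JVM being
inspected (it does not attach to a remote map child); DeadlockCheck and its
method names are invented for illustration:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class DeadlockCheck {
    /** True if the JVM reports threads deadlocked on monitors or ownable synchronizers. */
    static boolean hasJavaLevelDeadlock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        // Both methods return null when no deadlock is found.
        return ids != null && ids.length > 0;
    }

    /** Counts live threads currently in the given state. */
    static int countThreadsInState(Thread.State state) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int count = 0;
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // Entries are null for threads that died between the two calls.
            if (info != null && info.getThreadState() == state) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Java-level deadlock: " + hasJavaLevelDeadlock());
        System.out.println("WAITING threads: " + countThreadsInState(Thread.State.WAITING));
    }
}
```

A thread that is merely WAITING for a reply to a remote call would not be
reported by this check, since no lock cycle is involved; that case looks like a
hang on the remote side rather than a deadlock in the child.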

Schubert

On Sun, Mar 1, 2009 at 6:50 AM, stack <st...@duboce.net> wrote:

> The client is trying to open a scanner on 10.24.1.14 (or .12).  Can you look
> in the regionserver logs on that machine and see what's holding it
> up?  Does it never move on from here?
> St.Ack
>
> On Sat, Feb 28, 2009 at 3:07 AM, schubert zhang <zs...@gmail.com> wrote:
>
> > And another problem.
> >
> > When I ran the RowCounter job to count the rows of the sort10g table, the
> > job's map child process got stuck and could not complete.
> >
> > [schubert@nd1-rack0-cloud bin]$ jps
> > 14069 Child
> > 13124 Child
> > 7081 HRegionServer
> > 14190 Child
> > 6841 DataNode
> > 14158 Child
> > 12827 TaskTracker
> > 14266 Child
> > 14333 Jps
> > [schubert@nd1-rack0-cloud bin]$ jstack -l 14266
> > 2009-02-28 18:01:09
> > Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b22 mixed mode):
> >
> > "Attach Listener" daemon prio=10 tid=0x0000000049801c00 nid=0x382e waiting on condition [0x0000000000000000..0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "IPC Client (47) connection to /10.24.1.14:60020 from an unknown user" daemon prio=10 tid=0x00002aaaf844f800 nid=0x381a runnable [0x000000004151c000..0x000000004151cb80]
> >   java.lang.Thread.State: RUNNABLE
> >        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
> >        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> >        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> >        - locked <0x00002aaabe0f6110> (a sun.nio.ch.Util$1)
> >        - locked <0x00002aaabe0f60f8> (a java.util.Collections$UnmodifiableSet)
> >        - locked <0x00002aaabe0f5d68> (a sun.nio.ch.EPollSelectorImpl)
> >        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> >        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
> >        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
> >        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
> >        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
> >        at java.io.FilterInputStream.read(FilterInputStream.java:116)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:276)
> >        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >        - locked <0x00002aaadf9168f8> (a java.io.BufferedInputStream)
> >        at java.io.DataInputStream.readInt(DataInputStream.java:370)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:498)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:443)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "IPC Client (47) connection to /10.24.1.12:60020 from an unknown user" daemon prio=10 tid=0x00002aaaf82bf000 nid=0x37d0 in Object.wait() [0x000000004161d000..0x000000004161dd00]
> >   java.lang.Thread.State: TIMED_WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
> >        - locked <0x00002aaabe13ea18> (a org.apache.hadoop.hbase.ipc.HBaseClient$Connection)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "SpillThread" daemon prio=10 tid=0x00002aaaf827fc00 nid=0x37cd waiting on condition [0x000000004131a000..0x000000004131ac80]
> >   java.lang.Thread.State: WAITING (parking)
> >        at sun.misc.Unsafe.park(Native Method)
> >        - parking to wait for  <0x00002aaabe0ebc80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> >        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
> >        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:882)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Comm thread for attempt_200902271728_0012_m_000017_1" daemon prio=10 tid=0x00002aaaf82d7c00 nid=0x37cc waiting on condition [0x0000000041219000..0x0000000041219b00]
> >   java.lang.Thread.State: TIMED_WAITING (sleeping)
> >        at java.lang.Thread.sleep(Native Method)
> >        at org.apache.hadoop.mapred.Task$1.run(Task.java:403)
> >        at java.lang.Thread.run(Thread.java:619)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Thread for syncLogs" daemon prio=10 tid=0x00002aaaf82e9800 nid=0x37ca waiting on condition [0x0000000041017000..0x0000000041017a00]
> >   java.lang.Thread.State: TIMED_WAITING (sleeping)
> >        at java.lang.Thread.sleep(Native Method)
> >        at org.apache.hadoop.mapred.Child$1.run(Child.java:77)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "IPC Client (47) connection to /127.0.0.1:33444 from an unknown user" daemon prio=10 tid=0x00002aaaf81efc00 nid=0x37c9 in Object.wait() [0x0000000040f16000..0x0000000040f16a80]
> >   java.lang.Thread.State: TIMED_WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:396)
> >        - locked <0x00002aaabe0ebf48> (a org.apache.hadoop.ipc.Client$Connection)
> >        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Low Memory Detector" daemon prio=10 tid=0x000000004979cc00 nid=0x37c7 runnable [0x0000000000000000..0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "CompilerThread1" daemon prio=10 tid=0x000000004979a800 nid=0x37c6 waiting on condition [0x0000000000000000..0x0000000040c12450]
> >   java.lang.Thread.State: RUNNABLE
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "CompilerThread0" daemon prio=10 tid=0x0000000049797000 nid=0x37c5 waiting on condition [0x0000000000000000..0x0000000040b11520]
> >   java.lang.Thread.State: RUNNABLE
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Signal Dispatcher" daemon prio=10 tid=0x0000000049795800 nid=0x37c4 runnable [0x0000000000000000..0x0000000040a11790]
> >   java.lang.Thread.State: RUNNABLE
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Finalizer" daemon prio=10 tid=0x000000004976ac00 nid=0x37c3 in Object.wait() [0x0000000040910000..0x0000000040910b80]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
> >        - locked <0x00002aaabe0db6c8> (a java.lang.ref.ReferenceQueue$Lock)
> >        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
> >        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "Reference Handler" daemon prio=10 tid=0x0000000049769400 nid=0x37c2 in Object.wait() [0x000000004080f000..0x000000004080fa00]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        at java.lang.Object.wait(Object.java:485)
> >        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
> >        - locked <0x00002aaabe0ec428> (a java.lang.ref.Reference$Lock)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "main" prio=10 tid=0x00000000496e2000 nid=0x37bc in Object.wait() [0x0000000040209000..0x0000000040209ec0]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        at java.lang.Object.wait(Object.java:485)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:695)
> >        - locked <0x00002aaadf91b250> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:321)
> >        at $Proxy3.openScanner(Unknown Source)
> >        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:86)
> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:77)
> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:34)
> >        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:828)
> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1582)
> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1645)
> >        at net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown Source)
> >        at net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown Source)
> >        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
> >        - locked <0x00002aaabe169bf0> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
> >        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
> >        - locked <0x00002aaabe169bf0> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:158)
> >
> >   Locked ownable synchronizers:
> >        - None
> >
> > "VM Thread" prio=10 tid=0x0000000049764000 nid=0x37c1 runnable
> >
> > "GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000496ec000 nid=0x37bd runnable
> >
> > "GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000496ed400 nid=0x37be runnable
> >
> > "GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000496ee800 nid=0x37bf runnable
> >
> > "GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000496efc00 nid=0x37c0 runnable
> >
> > "VM Periodic Task Thread" prio=10 tid=0x000000004979e800 nid=0x37c8 waiting on condition
> >
> > JNI global references: 847
> >
>

Re: strange region name, is it right?

Posted by stack <st...@duboce.net>.
The client is trying to open a scanner on 10.24.1.14 (or .12).  Can you look
in the regionserver logs on that machine and see what's holding it
up?  Does it never move on from here?
St.Ack

On Sat, Feb 28, 2009 at 3:07 AM, schubert zhang <zs...@gmail.com> wrote:

> And another problem.
>
> When I ran the RowCounter job to count the rows of the sort10g table, the
> job's map child process got stuck and could not complete.
>
> [schubert@nd1-rack0-cloud bin]$ jps
> 14069 Child
> 13124 Child
> 7081 HRegionServer
> 14190 Child
> 6841 DataNode
> 14158 Child
> 12827 TaskTracker
> 14266 Child
> 14333 Jps
> [schubert@nd1-rack0-cloud bin]$ jstack -l 14266
> 2009-02-28 18:01:09
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b22 mixed mode):
>
> "Attach Listener" daemon prio=10 tid=0x0000000049801c00 nid=0x382e waiting on condition [0x0000000000000000..0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>        - None
>
> "IPC Client (47) connection to /10.24.1.14:60020 from an unknown user" daemon prio=10 tid=0x00002aaaf844f800 nid=0x381a runnable [0x000000004151c000..0x000000004151cb80]
>   java.lang.Thread.State: RUNNABLE
>        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>        - locked <0x00002aaabe0f6110> (a sun.nio.ch.Util$1)
>        - locked <0x00002aaabe0f60f8> (a java.util.Collections$UnmodifiableSet)
>        - locked <0x00002aaabe0f5d68> (a sun.nio.ch.EPollSelectorImpl)
>        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
>        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
>        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
>        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
>        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:276)
>        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>        - locked <0x00002aaadf9168f8> (a java.io.BufferedInputStream)
>        at java.io.DataInputStream.readInt(DataInputStream.java:370)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:498)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:443)
>
>   Locked ownable synchronizers:
>        - None
>
> "IPC Client (47) connection to /10.24.1.12:60020 from an unknown user" daemon prio=10 tid=0x00002aaaf82bf000 nid=0x37d0 in Object.wait() [0x000000004161d000..0x000000004161dd00]
>   java.lang.Thread.State: TIMED_WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
>        - locked <0x00002aaabe13ea18> (a org.apache.hadoop.hbase.ipc.HBaseClient$Connection)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)
>
>   Locked ownable synchronizers:
>        - None
>
> "SpillThread" daemon prio=10 tid=0x00002aaaf827fc00 nid=0x37cd waiting on condition [0x000000004131a000..0x000000004131ac80]
>   java.lang.Thread.State: WAITING (parking)
>        at sun.misc.Unsafe.park(Native Method)
>        - parking to wait for  <0x00002aaabe0ebc80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:882)
>
>   Locked ownable synchronizers:
>        - None
>
> "Comm thread for attempt_200902271728_0012_m_000017_1" daemon prio=10 tid=0x00002aaaf82d7c00 nid=0x37cc waiting on condition [0x0000000041219000..0x0000000041219b00]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>        at java.lang.Thread.sleep(Native Method)
>        at org.apache.hadoop.mapred.Task$1.run(Task.java:403)
>        at java.lang.Thread.run(Thread.java:619)
>
>   Locked ownable synchronizers:
>        - None
>
> "Thread for syncLogs" daemon prio=10 tid=0x00002aaaf82e9800 nid=0x37ca waiting on condition [0x0000000041017000..0x0000000041017a00]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>        at java.lang.Thread.sleep(Native Method)
>        at org.apache.hadoop.mapred.Child$1.run(Child.java:77)
>
>   Locked ownable synchronizers:
>        - None
>
> "IPC Client (47) connection to /127.0.0.1:33444 from an unknown user" daemon prio=10 tid=0x00002aaaf81efc00 nid=0x37c9 in Object.wait() [0x0000000040f16000..0x0000000040f16a80]
>   java.lang.Thread.State: TIMED_WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:396)
>        - locked <0x00002aaabe0ebf48> (a org.apache.hadoop.ipc.Client$Connection)
>        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
>
>   Locked ownable synchronizers:
>        - None
>
> "Low Memory Detector" daemon prio=10 tid=0x000000004979cc00 nid=0x37c7 runnable [0x0000000000000000..0x0000000000000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>        - None
>
> "CompilerThread1" daemon prio=10 tid=0x000000004979a800 nid=0x37c6 waiting on condition [0x0000000000000000..0x0000000040c12450]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>        - None
>
> "CompilerThread0" daemon prio=10 tid=0x0000000049797000 nid=0x37c5 waiting on condition [0x0000000000000000..0x0000000040b11520]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>        - None
>
> "Signal Dispatcher" daemon prio=10 tid=0x0000000049795800 nid=0x37c4 runnable [0x0000000000000000..0x0000000040a11790]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>        - None
>
> "Finalizer" daemon prio=10 tid=0x000000004976ac00 nid=0x37c3 in Object.wait() [0x0000000040910000..0x0000000040910b80]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
>        - locked <0x00002aaabe0db6c8> (a java.lang.ref.ReferenceQueue$Lock)
>        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
>        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>
>   Locked ownable synchronizers:
>        - None
>
> "Reference Handler" daemon prio=10 tid=0x0000000049769400 nid=0x37c2 in Object.wait() [0x000000004080f000..0x000000004080fa00]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        at java.lang.Object.wait(Object.java:485)
>        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>        - locked <0x00002aaabe0ec428> (a java.lang.ref.Reference$Lock)
>
>   Locked ownable synchronizers:
>        - None
>
> "main" prio=10 tid=0x00000000496e2000 nid=0x37bc in Object.wait()
> [0x0000000040209000..0x0000000040209ec0]
>   java.lang.Thread.State: WAITING (on object monitor)
>        at java.lang.Object.wait(Native Method)
>        at java.lang.Object.wait(Object.java:485)
>        at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:695)
>        - locked <0x00002aaadf91b250> (a
> org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:321)
>        at $Proxy3.openScanner(Unknown Source)
>        at
>
> org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:86)
>        at
>
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:77)
>        at
>
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:34)
>        at
>
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:828)
>        at
>
> org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1582)
>        at
> org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1645)
>        at
> net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown
> Source)
>        at
> net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown
> Source)
>        at
>
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>        - locked <0x00002aaabe169bf0> (a
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>        at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>        - locked <0x00002aaabe169bf0> (a
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>        at org.apache.hadoop.mapred.Child.main(Child.java:158)
>
>   Locked ownable synchronizers:
>        - None
>
> "VM Thread" prio=10 tid=0x0000000049764000 nid=0x37c1 runnable
>
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000496ec000 nid=0x37bd
> runnable
>
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000496ed400 nid=0x37be
> runnable
>
> "GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000496ee800 nid=0x37bf
> runnable
>
> "GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000496efc00 nid=0x37c0
> runnable
>
> "VM Periodic Task Thread" prio=10 tid=0x000000004979e800 nid=0x37c8 waiting
> on condition
>
> JNI global references: 847
>

Re: strange region name, is it right?

Posted by schubert zhang <zs...@gmail.com>.
And another problem.

When I ran the RowCounter job to count the rows of the sort10g table, the
job's map child process blocked and could not complete.

[schubert@nd1-rack0-cloud bin]$ jps
14069 Child
13124 Child
7081 HRegionServer
14190 Child
6841 DataNode
14158 Child
12827 TaskTracker
14266 Child
14333 Jps
[schubert@nd1-rack0-cloud bin]$ jstack -l 14266
2009-02-28 18:01:09
Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b22 mixed mode):

"Attach Listener" daemon prio=10 tid=0x0000000049801c00 nid=0x382e waiting
on condition [0x0000000000000000..0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"IPC Client (47) connection to /10.24.1.14:60020 from an unknown user"
daemon prio=10 tid=0x00002aaaf844f800 nid=0x381a runnable
[0x000000004151c000..0x000000004151cb80]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0x00002aaabe0f6110> (a sun.nio.ch.Util$1)
        - locked <0x00002aaabe0f60f8> (a
java.util.Collections$UnmodifiableSet)
        - locked <0x00002aaabe0f5d68> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
        at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
        at
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
        at
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:276)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        - locked <0x00002aaadf9168f8> (a java.io.BufferedInputStream)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:498)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:443)

   Locked ownable synchronizers:
        - None

"IPC Client (47) connection to /10.24.1.12:60020 from an unknown user"
daemon prio=10 tid=0x00002aaaf82bf000 nid=0x37d0 in Object.wait()
[0x000000004161d000..0x000000004161dd00]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
        - locked <0x00002aaabe13ea18> (a
org.apache.hadoop.hbase.ipc.HBaseClient$Connection)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)

   Locked ownable synchronizers:
        - None

"SpillThread" daemon prio=10 tid=0x00002aaaf827fc00 nid=0x37cd waiting on
condition [0x000000004131a000..0x000000004131ac80]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaabe0ebc80> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
        at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:882)

   Locked ownable synchronizers:
        - None

"Comm thread for attempt_200902271728_0012_m_000017_1" daemon prio=10
tid=0x00002aaaf82d7c00 nid=0x37cc waiting on condition
[0x0000000041219000..0x0000000041219b00]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:403)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"Thread for syncLogs" daemon prio=10 tid=0x00002aaaf82e9800 nid=0x37ca
waiting on condition [0x0000000041017000..0x0000000041017a00]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Child$1.run(Child.java:77)

   Locked ownable synchronizers:
        - None

"IPC Client (47) connection to /127.0.0.1:33444 from an unknown user" daemon
prio=10 tid=0x00002aaaf81efc00 nid=0x37c9 in Object.wait()
[0x0000000040f16000..0x0000000040f16a80]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at
org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:396)
        - locked <0x00002aaabe0ebf48> (a
org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)

   Locked ownable synchronizers:
        - None

"Low Memory Detector" daemon prio=10 tid=0x000000004979cc00 nid=0x37c7
runnable [0x0000000000000000..0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"CompilerThread1" daemon prio=10 tid=0x000000004979a800 nid=0x37c6 waiting
on condition [0x0000000000000000..0x0000000040c12450]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"CompilerThread0" daemon prio=10 tid=0x0000000049797000 nid=0x37c5 waiting
on condition [0x0000000000000000..0x0000000040b11520]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Signal Dispatcher" daemon prio=10 tid=0x0000000049795800 nid=0x37c4
runnable [0x0000000000000000..0x0000000040a11790]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Finalizer" daemon prio=10 tid=0x000000004976ac00 nid=0x37c3 in
Object.wait() [0x0000000040910000..0x0000000040910b80]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked <0x00002aaabe0db6c8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

   Locked ownable synchronizers:
        - None

"Reference Handler" daemon prio=10 tid=0x0000000049769400 nid=0x37c2 in
Object.wait() [0x000000004080f000..0x000000004080fa00]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00002aaabe0ec428> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
        - None

"main" prio=10 tid=0x00000000496e2000 nid=0x37bc in Object.wait()
[0x0000000040209000..0x0000000040209ec0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at
org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:695)
        - locked <0x00002aaadf91b250> (a
org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:321)
        at $Proxy3.openScanner(Unknown Source)
        at
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:86)
        at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:77)
        at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:34)
        at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:828)
        at
org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1582)
        at
org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1645)
        at
net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown Source)
        at
net.sandmill.examples.mapred.hbase.TableRowRecordReader.next(Unknown Source)
        at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
        - locked <0x00002aaabe169bf0> (a
org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
        at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
        - locked <0x00002aaabe169bf0> (a
org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.Child.main(Child.java:158)

   Locked ownable synchronizers:
        - None

"VM Thread" prio=10 tid=0x0000000049764000 nid=0x37c1 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000496ec000 nid=0x37bd
runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000496ed400 nid=0x37be
runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000496ee800 nid=0x37bf
runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000496efc00 nid=0x37c0
runnable

"VM Periodic Task Thread" prio=10 tid=0x000000004979e800 nid=0x37c8 waiting
on condition

JNI global references: 847

Re: strange region name, is it right?

Posted by schubert zhang <zs...@gmail.com>.
Hi Stack,

I have sent the TeraDataGen and TeraDataSort code to your duboce.net address
in a separate email. Please check it for reference.

1. The keys of TeraDataGen are not binary; they are displayable characters
from ASCII code ' ' (space) to '~'.
The format of each row is: (10 bytes key) (10 bytes rowid) (78 bytes filler)
\r\n
The keys are random characters from the set ' ' .. '~'.
The rowid is the right-justified row id as an int.
The filler consists of 7 runs of 10 characters from 'A' to 'Z'.

I defined a very simple HBase table to store the sorted data: create 't1',
{NAME => 't', VERSIONS => 1}; the only column is t:v.
RowKey = (10 bytes key)
Value of column t:v = (10 bytes rowid)(78 bytes filler)\r\n
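
To make the mapping above concrete, here is a minimal sketch of how one
100-byte record could be split into the row key and the t:v value. The class
and method names (TeraRecord, rowKey, value) are hypothetical and are not
taken from the TeraDataGen/TeraDataSort code mentioned above; it only assumes
the 10 + 10 + 78 + 2 byte layout described in this mail.

```java
import java.util.Arrays;

// Hypothetical helper: splits one TeraGen-style record into HBase row key and value.
final class TeraRecord {
    static final int KEY_LEN = 10;
    // 10 bytes key + 10 bytes rowid + 78 bytes filler + 2 bytes for \r\n
    static final int RECORD_LEN = 100;

    // The first 10 bytes become the HBase row key.
    static byte[] rowKey(byte[] record) {
        check(record);
        return Arrays.copyOfRange(record, 0, KEY_LEN);
    }

    // The remaining 90 bytes (rowid, filler, \r\n) become the value of t:v.
    static byte[] value(byte[] record) {
        check(record);
        return Arrays.copyOfRange(record, KEY_LEN, RECORD_LEN);
    }

    private static void check(byte[] record) {
        if (record.length != RECORD_LEN) {
            throw new IllegalArgumentException(
                "expected " + RECORD_LEN + " bytes, got " + record.length);
        }
    }
}
```

Working on byte arrays rather than Strings keeps the key exactly as generated,
whatever characters it contains.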

2. I have done more tests and found the following:
Because some row keys contain the character '<' and/or '>', the web UI cannot
display them correctly. But the row key is correct when I get it through the
HBase API. Maybe the Web UI code should be modified.
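
As an illustration of the kind of change meant here, the sketch below escapes
the characters that a browser treats as markup before a row key is written
into a page. It is not the HBase web UI code; HtmlEscape and escape are
hypothetical names used only for this example.

```java
// Hypothetical helper: makes a row key safe to embed in an HTML page.
final class HtmlEscape {
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;   // must not start an entity
                case '<': sb.append("&lt;"); break;    // must not open a tag
                case '>': sb.append("&gt;"); break;    // must not close a tag
                case '"': sb.append("&quot;"); break;  // safe inside attribute values
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this, a start key such as <$# would be written to the page as &lt;$#, so
the browser shows the literal characters instead of treating them as the start
of a tag.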

3. Another question:
I found that the Region Name in the Web UI is delimited by commas.
Can I have a comma character in the row key string?
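
To show why the question matters, here is a sketch of one way a name of the
form table,startKey,id (as seen in the listing) could be split even when the
start key itself contains commas: take the first comma and the last comma as
the delimiters. This assumes the table name and the trailing id never contain
a comma. Whether HBase 0.19 parses region names this way is not confirmed
here; RegionNameParts is a hypothetical class for illustration only.

```java
// Hypothetical parser for names shaped like "table,startKey,id".
final class RegionNameParts {
    final String table;
    final String startKey;
    final String regionId;

    private RegionNameParts(String table, String startKey, String regionId) {
        this.table = table;
        this.startKey = startKey;
        this.regionId = regionId;
    }

    static RegionNameParts parse(String regionName) {
        int first = regionName.indexOf(',');      // end of the table name
        int last = regionName.lastIndexOf(',');   // start of the trailing id
        if (first < 0 || first == last) {
            throw new IllegalArgumentException(
                "expected table,startKey,id but got: " + regionName);
        }
        // Everything between the two delimiters is the start key, commas included.
        return new RegionNameParts(
            regionName.substring(0, first),
            regionName.substring(first + 1, last),
            regionName.substring(last + 1));
    }
}
```

A naive split on every comma would instead break a key such as a,b into two
fields, which is the situation the question is worried about.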

Regards,
Schubert

On Sun, Mar 1, 2009 at 7:05 AM, stack <st...@duboce.net> wrote:

> This sounds like an interesting exercise.   We should do same on this end
> proving a release on a cluster just before we put it out.
> Are the keys that TeraGen makes binary?  Maybe check its source?
>
> If they are, they'll look odd in the UI and on shell; we don't support them
> in UI and shell (yet) but hbase should operate fine with binary keys.  Is it
> not working for you?
>
> St.Ack
>
>
> On Sat, Feb 28, 2009 at 2:56 AM, schubert zhang <zs...@gmail.com> wrote:
>
> > I have being used HBase and Hadoop for 5 months.
> >
> > My testbed have 5node(1mastar and 4slaves)
> > Hadoop-0.19.1
> > HBase-0.19.0
> >
> > 1. I use the TeraGen mapreduce job of hadoop examples, to generate files
> > with random key-value paires.
> >    I just create a 1G data and  another 10G data for later test.
> >
> > 2. Then write a job to read these TeraGen files and insert each record's
> > key-value to a HBase table.
> >    (create 'sort1g', {NAME => 't', VERSIONS => 1}
> >     (create 'sort10g', {NAME => 't', VERSIONS => 1}
> >    I want use this insert jobs to simulate the TeraSort, since HBase
> > automatically sort rows.
> >
> > 3. after finish the insert jobs. On the web interface of HBase, I found
> > following strange thing:
> >
> > Name Region Server Encoded Name Start Key End Key
> > ......
> > sort10g,%ql`{^8Bcf,1235730412828   nd2-rack0-cloud:60020   155375382
> >  %ql`{^8Bcf   &YK&Uop0a=
> > sort10g,&YK&Uop0a=,1235730749832  nd1-rack0-cloud:60020  1574155935
> >  &YK&Uop0a=  'B'Zp+!]Tb
> > sort10g,'B'Zp+!]Tb,1235730749832  nd1-rack0-cloud:60020  395792177
> >  'B'Zp+!]Tb  ()o:
> > sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> > sort10g,(qYp"7;j2$,1235730731006  nd1-rack0-cloud:60020  2143364419
> >  (qYp"7;j2$  )Z/?>:ZM3Z
> > sort10g,)Z/?>:ZM3Z,1235730853698  nd2-rack0-cloud:60020  440987412
> >  )Z/?>:ZM3Z  *BuVHF#1ME
> > .......
> > sort10g,:Qt-(8;Y>i,1235730441379   nd1-rack0-cloud:60020   1461025497
> >  :Qt-(8;Y>i   ;;Vg!IT[d"
> > sort10g,;;Vg!IT[d",1235730461102  nd1-rack0-cloud:60020  36776992
> >  ;;Vg!IT[d"  <$#
> > sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> > sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> > sort10g,=VyK?xTtI`,1235730334262  nd3-rack0-cloud:60020  1165072084
> >  =VyK?xTtI`  >A274Dj=vU
> >  .......
> > sort10g,s#Y}pGP|{3,1235730476424   nd1-rack0-cloud:60020   1728348677
> >  s#Y}pGP|{3   soWA+0=0Ao
> > sort10g,soWA+0=0Ao,1235730487163  nd1-rack0-cloud:60020  1275380223
> >  soWA+0=0Ao  t\<
> > sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
> > sort10g,uI-1OW2g=t,1235730515195  nd1-rack0-cloud:60020  232566103
> >  uI-1OW2g=t  v6'-_5E]7'
> >
> >
> > In above lines, some look not like normal:
> > sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> > sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> > sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> > sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
> >
> >
> > Coud you please tell me it is right or not.
> >
>

Re: strange region name, is it right?

Posted by stack <st...@duboce.net>.
This sounds like an interesting exercise.   We should do the same on this end,
proving a release on a cluster just before we put it out.
Are the keys that TeraGen makes binary?  Maybe check its source?

If they are, they'll look odd in the UI and on shell; we don't support them
in UI and shell (yet) but hbase should operate fine with binary keys.  Is it
not working for you?

St.Ack


On Sat, Feb 28, 2009 at 2:56 AM, schubert zhang <zs...@gmail.com> wrote:

> I have being used HBase and Hadoop for 5 months.
>
> My testbed have 5node(1mastar and 4slaves)
> Hadoop-0.19.1
> HBase-0.19.0
>
> 1. I use the TeraGen mapreduce job of hadoop examples, to generate files
> with random key-value paires.
>    I just create a 1G data and  another 10G data for later test.
>
> 2. Then write a job to read these TeraGen files and insert each record's
> key-value to a HBase table.
>    (create 'sort1g', {NAME => 't', VERSIONS => 1}
>     (create 'sort10g', {NAME => 't', VERSIONS => 1}
>    I want use this insert jobs to simulate the TeraSort, since HBase
> automatically sort rows.
>
> 3. after finish the insert jobs. On the web interface of HBase, I found
> following strange thing:
>
> Name Region Server Encoded Name Start Key End Key
> ......
> sort10g,%ql`{^8Bcf,1235730412828   nd2-rack0-cloud:60020   155375382
>  %ql`{^8Bcf   &YK&Uop0a=
> sort10g,&YK&Uop0a=,1235730749832  nd1-rack0-cloud:60020  1574155935
>  &YK&Uop0a=  'B'Zp+!]Tb
> sort10g,'B'Zp+!]Tb,1235730749832  nd1-rack0-cloud:60020  395792177
>  'B'Zp+!]Tb  ()o:
> sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> sort10g,(qYp"7;j2$,1235730731006  nd1-rack0-cloud:60020  2143364419
>  (qYp"7;j2$  )Z/?>:ZM3Z
> sort10g,)Z/?>:ZM3Z,1235730853698  nd2-rack0-cloud:60020  440987412
>  )Z/?>:ZM3Z  *BuVHF#1ME
> .......
> sort10g,:Qt-(8;Y>i,1235730441379   nd1-rack0-cloud:60020   1461025497
>  :Qt-(8;Y>i   ;;Vg!IT[d"
> sort10g,;;Vg!IT[d",1235730461102  nd1-rack0-cloud:60020  36776992
>  ;;Vg!IT[d"  <$#
> sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> sort10g,=VyK?xTtI`,1235730334262  nd3-rack0-cloud:60020  1165072084
>  =VyK?xTtI`  >A274Dj=vU
>  .......
> sort10g,s#Y}pGP|{3,1235730476424   nd1-rack0-cloud:60020   1728348677
>  s#Y}pGP|{3   soWA+0=0Ao
> sort10g,soWA+0=0Ao,1235730487163  nd1-rack0-cloud:60020  1275380223
>  soWA+0=0Ao  t\<
> sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
> sort10g,uI-1OW2g=t,1235730515195  nd1-rack0-cloud:60020  232566103
>  uI-1OW2g=t  v6'-_5E]7'
>
>
> In above lines, some look not like normal:
> sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
>
>
> Coud you please tell me it is right or not.
>