You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by "R.L." <xm...@gmail.com> on 2011/11/26 18:17:37 UTC

Why the map reduce job shows 17125 records wrote but hbase table only have 6790 rows?

I use hbase 0.93, hadoop 0.20.2.
Here is the mapreduce job output:

11/11/27 01:04:32 INFO mapred.JobClient: Running job: job_201111261309_0026
11/11/27 01:04:33 INFO mapred.JobClient:  map 0% reduce 0%
11/11/27 01:07:35 INFO mapred.JobClient:  map 100% reduce 0%
11/11/27 01:07:37 INFO mapred.JobClient: Job complete: job_201111261309_0026
11/11/27 01:07:37 INFO mapred.JobClient: Counters: 5
11/11/27 01:07:37 INFO mapred.JobClient:   Job Counters 
11/11/27 01:07:37 INFO mapred.JobClient:     Rack-local map tasks=1
11/11/27 01:07:37 INFO mapred.JobClient:     Launched map tasks=1
11/11/27 01:07:37 INFO mapred.JobClient:   Map-Reduce Framework
11/11/27 01:07:37 INFO mapred.JobClient:     Map input records=60723
11/11/27 01:07:37 INFO mapred.JobClient:     Spilled Records=0
11/11/27 01:07:37 INFO mapred.JobClient:     Map output records=17125

This job extract the data from one table which has 60723 records into another 
new one. And I also log all these records and the log file looks fine. But the 
output table only have 6790 when the job done. 

Here is the hbase log from regionserver which the output table belongs to:

============================================================================
2011-11-27 01:01:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270, 
accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791, 
cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0, 
evictedPerRun=NaN
2011-11-27 01:04:30,846 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: 
Received request to open 1 region(s)
2011-11-27 01:04:30,846 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: 
Received request to open region: 
doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
2011-11-27 01:04:30,846 DEBUG 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open 
of doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
2011-11-27 01:04:30,846 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Attempting to transition node 
47107b5eed754b4e2eba88dc898eef9b from M_ZK_REGION_OFFLINE to 
RS_ZK_REGION_OPENING
2011-11-27 01:04:30,848 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Successfully transitioned node 
47107b5eed754b4e2eba88dc898eef9b from M_ZK_REGION_OFFLINE to 
RS_ZK_REGION_OPENING
2011-11-27 01:04:30,848 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Opening region: REGION => {NAME => 
'doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.', 
STARTKEY => '', ENDKEY => '', ENCODED => 47107b5eed754b4e2eba88dc898eef9b, TABLE 
=> {{NAME => 'doctor2_suggestions_index', FAMILIES => [{NAME => 'fm.suggestion', 
BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS 
=> '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', 
BLOCKCACHE => 'true'}]}}
2011-11-27 01:04:30,849 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Instantiated 
doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
2011-11-27 01:04:30,853 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Onlined 
doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.; next 
sequenceid=1
2011-11-27 01:04:30,853 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Attempting to transition node 
47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to 
RS_ZK_REGION_OPENING
2011-11-27 01:04:30,854 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Successfully transitioned node 
47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to 
RS_ZK_REGION_OPENING
2011-11-27 01:04:30,870 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated 
row doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b. 
in region .META.,,1 with server=pipe2:60020, startcode=1322296597658
2011-11-27 01:04:30,870 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Attempting to transition node 
47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to 
RS_ZK_REGION_OPENED
2011-11-27 01:04:30,875 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x133df04d2e20006 Successfully transitioned node 
47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to 
RS_ZK_REGION_OPENED
2011-11-27 01:04:30,875 DEBUG 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened 
doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
2011-11-27 01:06:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270, 
accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791, 
cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0, 
evictedPerRun=NaN
2011-11-27 01:11:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270, 
accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791, 
cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0, 
evictedPerRun=NaN



Re: Why the map reduce job shows 17125 records wrote but hbase table only have 6790 rows?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
If it really reports at the MapReduce level that it has inserted less
rows than you expected, then it's most probably an error in your code
or a misunderstanding of some sort.

Looking at the code itself might help us helping you.

J-D

On Sat, Nov 26, 2011 at 9:17 AM, R.L. <xm...@gmail.com> wrote:
> I use hbase 0.93, hadoop 0.20.2.
> Here is the mapreduce job output:
>
> 11/11/27 01:04:32 INFO mapred.JobClient: Running job: job_201111261309_0026
> 11/11/27 01:04:33 INFO mapred.JobClient:  map 0% reduce 0%
> 11/11/27 01:07:35 INFO mapred.JobClient:  map 100% reduce 0%
> 11/11/27 01:07:37 INFO mapred.JobClient: Job complete: job_201111261309_0026
> 11/11/27 01:07:37 INFO mapred.JobClient: Counters: 5
> 11/11/27 01:07:37 INFO mapred.JobClient:   Job Counters
> 11/11/27 01:07:37 INFO mapred.JobClient:     Rack-local map tasks=1
> 11/11/27 01:07:37 INFO mapred.JobClient:     Launched map tasks=1
> 11/11/27 01:07:37 INFO mapred.JobClient:   Map-Reduce Framework
> 11/11/27 01:07:37 INFO mapred.JobClient:     Map input records=60723
> 11/11/27 01:07:37 INFO mapred.JobClient:     Spilled Records=0
> 11/11/27 01:07:37 INFO mapred.JobClient:     Map output records=17125
>
> This job extract the data from one table which has 60723 records into another
> new one. And I also log all these records and the log file looks fine. But the
> output table only have 6790 when the job done.
>
> Here is the hbase log from regionserver which the output table belongs to:
>
> ============================================================================
> 2011-11-27 01:01:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270,
> accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791,
> cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0,
> evictedPerRun=NaN
> 2011-11-27 01:04:30,846 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> Received request to open 1 region(s)
> 2011-11-27 01:04:30,846 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> Received request to open region:
> doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
> 2011-11-27 01:04:30,846 DEBUG
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open
> of doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
> 2011-11-27 01:04:30,846 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Attempting to transition node
> 47107b5eed754b4e2eba88dc898eef9b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 2011-11-27 01:04:30,848 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Successfully transitioned node
> 47107b5eed754b4e2eba88dc898eef9b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 2011-11-27 01:04:30,848 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Opening region: REGION => {NAME =>
> 'doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.',
> STARTKEY => '', ENDKEY => '', ENCODED => 47107b5eed754b4e2eba88dc898eef9b, TABLE
> => {{NAME => 'doctor2_suggestions_index', FAMILIES => [{NAME => 'fm.suggestion',
> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS
> => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}]}}
> 2011-11-27 01:04:30,849 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Instantiated
> doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
> 2011-11-27 01:04:30,853 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Onlined
> doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.; next
> sequenceid=1
> 2011-11-27 01:04:30,853 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Attempting to transition node
> 47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_OPENING
> 2011-11-27 01:04:30,854 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Successfully transitioned node
> 47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_OPENING
> 2011-11-27 01:04:30,870 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated
> row doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
> in region .META.,,1 with server=pipe2:60020, startcode=1322296597658
> 2011-11-27 01:04:30,870 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Attempting to transition node
> 47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_OPENED
> 2011-11-27 01:04:30,875 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
> regionserver:60020-0x133df04d2e20006 Successfully transitioned node
> 47107b5eed754b4e2eba88dc898eef9b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_OPENED
> 2011-11-27 01:04:30,875 DEBUG
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened
> doctor2_suggestions_index,,1322327070776.47107b5eed754b4e2eba88dc898eef9b.
> 2011-11-27 01:06:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270,
> accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791,
> cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0,
> evictedPerRun=NaN
> 2011-11-27 01:11:42,590 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> LRU Stats: total=18.37 MB, free=181.21 MB, max=199.59 MB, blocks=270,
> accesses=1223390, hits=1219632, hitRatio=99.69%%, cachingAccesses=1219791,
> cachingHits=1219521, cachingHitsRatio=99.97%%, evictions=0, evicted=0,
> evictedPerRun=NaN
>
>
>