Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/06/06 08:51:54 UTC

[jira] Resolved: (HBASE-2615) M/R on bulk imported tables

     [ https://issues.apache.org/jira/browse/HBASE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-2615.
--------------------------

    Resolution: Fixed

Closing. It builds fine now on Hudson after my fixup.

> M/R on bulk imported tables
> ---------------------------
>
>                 Key: HBASE-2615
>                 URL: https://issues.apache.org/jira/browse/HBASE-2615
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.20.4
>         Environment: os.arch=amd64; os.version=2.6.9-67.ELsmp; java.version=1.6.0_15; java.vendor=Sun Microsystems Inc.
>            Reporter: Azza Abouzeid
>            Assignee: stack
>             Fix For: 0.20.5, 0.21.0
>
>         Attachments: 2615.txt, dummydata.tar.gz
>
>
> We are bulk importing using loadtable.rb and running M/R jobs using HBase as input.
> We're taking the following steps:
> 1a. Load HBase with a M/R job using the normal API. 
> OR
> 1b. Load HBase with bulk import.
> THEN
> 2a. Using the shell, do a "count" over the table.
> OR
> 2b. Run a M/R job that scans the whole HBase table (and nothing else).
> Of the 4 combos, 3 are fine: 1a+2a, 1a+2b, 1b+2a.  We're having trouble with 1b+2b.  When we run the M/R job, it doesn't seem to read in any records, but there are no explicit errors in either the Hadoop or HBase logs.
> Any ideas on what might be wrong with the bulk import to cause this problem?  We confirmed this problem exists in both hbase-0.20.3 and hbase-0.20.4.
> We have created dummy data (see attached). This is the test case:
> After loading the data into HDFS, in the hbase shell:
> create 'tiny', 'values'
> Execute: 
> {HBASE-HOME}/bin/hbase org.jruby.Main {HBASE-HOME}/bin/loadtable.rb tiny tinytable
> Then run the simple row counter:
> {HADOOP-HOME}/bin/hadoop jar {HBASE-HOME}/hbase-0.20.x.jar rowcounter tiny values
> Notice that the number of map input records read is always zero. We confirmed that other MapReduce jobs do not execute the map function at all, always returning 0 records.
> We also ran a major compaction on all HBase tables (including .META. and -ROOT-), but this did not fix the problem.
