Posted to common-user@hadoop.apache.org by ZhiHong Fu <dd...@gmail.com> on 2008/11/11 03:16:35 UTC

Mapper value is null problem

Hello,

I have written a custom DbRecordAndOpInputFormat that retrieves data
from several web services; the data format is like the rows of a
database ResultSet. I have run into a problem: in DbRecordReader's
next() method I get the correct (key, value) pair, but in the Mapper's
map(key, value) method the key is correct while the value is null. I
used
http://svn.eu.apache.org/viewvc/hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/lib/db/DbRecordInputFormat
as a reference.

At first I thought the cause was that I had not overridden the value's
write(DataOutput out) and readFields(DataInput in) methods, but I have
implemented them now and it still throws the same exception, "Value is
null". I'm very confused; can anyone help? Thanks.

I have implemented the value class, DbRecordAndOp, like this:
package zju.edu.tcmsearch.lucene.index.format;

import java.util.ArrayList;
import java.util.List;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.Text;

public class DbRecordAndOp implements Writable {

    private String recordId;
    private String dbResQname;
    private String tableName;
    private String showContent;
    private String clobContent;
    private String ontoIdentity;
    private String ontoName;
    private String primaryKey;
    private String joinTables;

    private int opType;

    public DbRecordAndOp() {}
    public String getRecordId() {
        return recordId;
    }
    public void setRecordId(String recordId) {
        this.recordId = recordId;
    }
    public String getDbResQname() {
        return dbResQname;
    }
    public void setDbResQname(String dbResQname) {
        this.dbResQname = dbResQname;
    }
    public String getTableName() {
        return tableName;
    }
    public void setTableName(String tableName) {
        this.tableName = tableName;
    }
    public String getShowContent() {
        return showContent;
    }
    public void setShowContent(String showContent) {
        this.showContent = showContent;
    }
    public String getClobContent() {
        return clobContent;
    }
    public void setClobContent(String clobContent) {
        this.clobContent = clobContent;
    }
    public String getOntoIdentity() {
        return ontoIdentity;
    }
    public void setOntoIdentity(String ontoIdentity) {
        this.ontoIdentity = ontoIdentity;
    }
    public String getOntoName() {
        return ontoName;
    }
    public void setOntoName(String ontoName) {
        this.ontoName = ontoName;
    }
    public String getPrimaryKey() {
        return primaryKey;
    }
    public void setPrimaryKey(String primaryKey) {
        this.primaryKey = primaryKey;
    }
    public String getJoinTables() {
        return joinTables;
    }
    public void setJoinTables(String joinTables) {
        this.joinTables = joinTables;
    }
    public int getOpType() {
        return opType;
    }
    public void setOpType(int opType) {
        this.opType = opType;
    }

    public void write(DataOutput out) throws IOException {
        System.out.println("DbRecordAndOp write method is invoking......!");
        Text.writeString(out, this.recordId);
        Text.writeString(out, this.dbResQname);
        Text.writeString(out, this.tableName);
        Text.writeString(out, this.showContent);
        Text.writeString(out, this.clobContent);
        Text.writeString(out, this.ontoIdentity);
        Text.writeString(out, this.ontoName);
        Text.writeString(out, this.primaryKey);
        Text.writeString(out, this.joinTables);
//        throw new IOException(this.getClass().getName()
//                + ".write should never be called");
    }

    public void readFields(DataInput in) throws IOException {
        System.out.println("DbRecordAndOp readFields method is invoking.....!");
        this.recordId = Text.readString(in);
        this.dbResQname = Text.readString(in);
        this.tableName = Text.readString(in);
        this.showContent = Text.readString(in);
        this.clobContent = Text.readString(in);
        this.ontoIdentity = Text.readString(in);
        this.ontoName = Text.readString(in);
        this.primaryKey = Text.readString(in);
        this.joinTables = Text.readString(in);
//        throw new IOException(this.getClass().getName()
//                + ".readFields should never be called");
    }

}
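
One thing worth checking in the value class above (a guess, not a confirmed diagnosis): Text.writeString(out, null) throws a NullPointerException, and Lucene's Field constructor, as the stack trace below shows, also refuses null values, so any column that comes back null from the web service will blow up at one of those two points. Note also that opType is never written in write() or read in readFields(), so it will not survive serialization. A null-safe round trip can be sketched like this, using writeUTF/readUTF in place of Text.writeString/readString so the example stands alone:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullSafeStringIo {

    // Map null to "" before writing so the serializer never sees null.
    // With Hadoop this would be: Text.writeString(out, s == null ? "" : s)
    static void writeNullable(DataOutput out, String s) throws IOException {
        out.writeUTF(s == null ? "" : s);
    }

    static String readNullable(DataInput in) throws IOException {
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeNullable(out, "table0");
        writeNullable(out, null); // e.g. a clobContent column that came back null

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readNullable(in));           // table0
        System.out.println(readNullable(in).isEmpty()); // true
    }
}
```

The same guard (or an explicit null check before building each Lucene Field) would also keep the Field constructor from seeing a null value in the mapper.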


And the exception is like this:

08/11/11 09:56:38 INFO format.DbRecordInputFormat: DbRecordInputFormat init method is invoking....!
08/11/11 09:56:38 INFO format.DbRecordInputFormat: DbRecordList size is 1000
08/11/11 09:56:38 INFO mapred.IndexUpdater: mapred.map.tasks = 2
08/11/11 09:56:38 INFO mapred.IndexUpdater: mapred.reduce.tasks = 4
08/11/11 09:56:38 INFO mapred.IndexUpdater: 4 shards = -1@IndexPath0/shard00000@-1,-1@IndexPath0/shard00001@-1,-1@IndexPath0/shard00002@-1,-1@IndexPath0/shard00003@-1
08/11/11 09:56:38 INFO mapred.IndexUpdater: mapred.input.format.class = zju.edu.tcmsearch.lucene.index.format.DbRecordInputFormat
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.Shard
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.IntermediateForm
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.Shard
08/11/11 09:56:38 INFO mapred.IndexUpdater: class org.apache.hadoop.io.Text
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.IndexUpdateMapper
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.IndexUpdatePartitioner
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.IndexUpdateCombiner
08/11/11 09:56:38 INFO mapred.IndexUpdater: class zju.edu.tcmsearch.lucene.index.mapred.IndexUpdateReducer
08/11/11 09:56:38 INFO mapred.IndexUpdater: Output Path:outputPath0
08/11/11 09:56:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/11/11 09:56:38 INFO format.DbRecordInputFormat: DbRecordInputFormat getSplits() method is invoking!
08/11/11 09:56:38 INFO format.DbRecordInputFormat: Record number is 1000
08/11/11 09:56:38 INFO format.DbRecordInputFormat: Split Size is 500
08/11/11 09:56:38 INFO format.DbRecordInputFormat: For Split[0], start index is 0; end index is 500
08/11/11 09:56:38 INFO format.DbRecordInputFormat: For Split[1], start index is 500; end index is 999
08/11/11 09:56:38 INFO mapred.JobClient: Running job: job_200811110916_0003
08/11/11 09:56:39 INFO mapred.JobClient:  map 0% reduce 0%
08/11/11 09:56:47 INFO mapred.JobClient: Task Id : attempt_200811110916_0003_m_000000_0, Status : FAILED
java.lang.NullPointerException: value cannot be null
        at org.apache.lucene.document.Field.<init>(Field.java:229)
        at org.apache.lucene.document.Field.<init>(Field.java:205)
        at zju.edu.tcmsearch.lucene.index.format.DbRecordLocalAnalysis.map(DbRecordLocalAnalysis.java:34)
        at zju.edu.tcmsearch.lucene.index.format.DbRecordLocalAnalysis.map(DbRecordLocalAnalysis.java:18)
        at zju.edu.tcmsearch.lucene.index.mapred.IndexUpdateMapper.map(IndexUpdateMapper.java:103)
        at zju.edu.tcmsearch.lucene.index.mapred.IndexUpdateMapper.map(IndexUpdateMapper.java:42)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

attempt_200811110916_0003_m_000000_0: createKey method is invoking!
attempt_200811110916_0003_m_000000_0: createValue method is invoking!
attempt_200811110916_0003_m_000000_0: Offset is 0
attempt_200811110916_0003_m_000000_0: The start of Split is 0
attempt_200811110916_0003_m_000000_0: The end of Split is 500
attempt_200811110916_0003_m_000000_0: Key is
attempt_200811110916_0003_m_000000_0: Value is zju.edu.tcmsearch.lucene.index.format.DbRecordAndOp@18aaa1e
attempt_200811110916_0003_m_000000_0: DbRecordAndOpList is null!
attempt_200811110916_0003_m_000000_0: table0
attempt_200811110916_0003_m_000000_0: Offset is 0
attempt_200811110916_0003_m_000000_0: Key is zju.edu.tcmsearch.lucene.index.mapred.DocumentID[DbRecord0]
attempt_200811110916_0003_m_000000_0: DbRecordAndOp info is:
attempt_200811110916_0003_m_000000_0: DbResQName is {http://www.dart.zju.edu.cn}db1
attempt_200811110916_0003_m_000000_0: TableName is table0
attempt_200811110916_0003_m_000000_0: OntologyUri is ontoUri0
attempt_200811110916_0003_m_000000_0: OntoName is ontoName0
attempt_200811110916_0003_m_000000_0: Show Content isCCNT, DartGrid , CCNT, DartGrid
attempt_200811110916_0003_m_000000_0: After next() method, Offset is 1
attempt_200811110916_0003_m_000000_0: Key is zju.edu.tcmsearch.lucene.index.mapred.DocumentID[DbRecord0]
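
For anyone hitting the same symptom: with the old mapred API, MapRunner calls createKey()/createValue() once and then hands those same objects to both next(key, value) and map(key, value). If DbRecordReader.next() assigns a freshly built DbRecordAndOp to its value parameter instead of copying fields into the object it was handed, next() will appear to return the right data while the mapper still sees the untouched instance. A minimal, Hadoop-free illustration of that aliasing pitfall (the class and field names here are stand-ins, not the original code):

```java
public class ReaderAliasingDemo {
    // Stand-in for DbRecordAndOp.
    static class Value { String tableName; }

    // Wrong: reassigning the parameter only changes the local reference;
    // the caller's object (the one map() will see) is untouched.
    static boolean badNext(Value value) {
        Value fresh = new Value();
        fresh.tableName = "table0";
        value = fresh; // lost when badNext returns
        return true;
    }

    // Right: mutate the framework-supplied object in place.
    static boolean goodNext(Value value) {
        value.tableName = "table0";
        return true;
    }

    public static void main(String[] args) {
        Value v = new Value();            // what createValue() would hand out
        badNext(v);
        System.out.println(v.tableName);  // null -- the mapper's symptom
        goodNext(v);
        System.out.println(v.tableName);  // table0
    }
}
```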