Posted to mapreduce-user@hadoop.apache.org by Joan <jo...@gmail.com> on 2011/01/14 13:57:35 UTC

how to write custom object using M/R

Hi,

I'm trying to write (K,V) pairs where K is a Text object and V is a
CustomObject, but it doesn't work.

I'm configuring the job's output so it can later be read with SequenceFileInputFormat; the job has:

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(CustomObject.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(CustomObject.class);

        SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));

And I get the following output (in the file part-r-00000):

K  CustomObject@2b237512
K  CustomObject@24db06de
...

When this job finishes I run another job whose input is read with
SequenceFileInputFormat, but it doesn't work:

The second job's configuration is:

        job.setInputFormatClass(SequenceFileInputFormat.class);
        SequenceFileInputFormat.addInputPath(job, new Path("myPath"));

But I get an error:

java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)


Can someone help me? I don't understand it. I don't know how to save my
object in the first M/R job, nor how to read it back in the second one.

Thanks

Joan

Re: how to write custom object using M/R

Posted by David Rosenstrauch <da...@darose.net>.
Sounds to me like your custom object isn't serializing properly.

You might want to read up on how to do it correctly here: 
http://developer.yahoo.com/hadoop/tutorial/module5.html#types

FYI - here's an example of a custom type I wrote, which I'm able to 
read/write successfully to/from a sequence file:


import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class UserStateRecordWritable implements Writable {

	public UserStateRecordWritable() {
		recordType = new Text();
		recordData = new BytesWritable();
	}

	public void readFields(DataInput in) throws IOException {
		recordType.readFields(in);
		recordData.readFields(in);
	}

	public void write(DataOutput out) throws IOException {
		recordType.write(out);
		recordData.write(out);
	}

	public void set(Text newRecordType, BytesWritable newRecordData) {
		recordType.set(newRecordType);
		recordData.set(newRecordData);
	}

	public Text getRecordType() {
		return recordType;
	}

	public BytesWritable getRecordData() {
		return recordData;
	}

	public String copyRecordType() {
		return recordType.toString();
	}

	public byte[] copyRecordData() {
		return TraitWeightUtils.getBytes(recordData);
	}

	private Text recordType;
	private BytesWritable recordData;
}
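The symmetry between write() and readFields() in the class above is the entire serialization contract. As a minimal, Hadoop-free sketch of that round trip (the class and field names here are invented for illustration; Hadoop's Writable interface declares exactly these two methods):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical record following the same two-method contract Hadoop's
// Writable uses: write(DataOutput) must be mirrored field-for-field,
// in the same order, by readFields(DataInput).
class DemoRecord {
    int id;
    String str;

    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(str);
    }

    void readFields(DataInput in) throws IOException {
        id = in.readInt();   // same order as write()
        str = in.readUTF();
    }
}

public class RoundTripDemo {
    public static void main(String[] args) throws IOException {
        DemoRecord a = new DemoRecord();
        a.id = 7;
        a.str = "hello";

        // Serialize to bytes, as the framework would when writing a sequence file.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        a.write(new DataOutputStream(buf));

        // Deserialize into a fresh instance, as a reader would.
        DemoRecord b = new DemoRecord();
        b.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(b.id + "\t" + b.str);
    }
}
```

If the two methods ever get out of sync (different field order, a field written but not read), deserialization silently produces garbage or throws an IOException.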


HTH,

DR

On 01/14/2011 07:57 AM, Joan wrote:
> [quoted message trimmed]


Re: how to write custom object using M/R

Posted by David Rosenstrauch <da...@darose.net>.
Maybe change "id" to be an IntWritable, and "str" to be a Text?

HTH,

DR

On 01/19/2011 09:36 AM, Joan wrote:
> [quoted message trimmed]


Re: how to write custom object using M/R

Posted by Joan <jo...@gmail.com>.
Hi Lance,

My custom object does implement Writable, but I don't override the
toString() method; could that be the problem?

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class MyWritable implements DBWritable, Writable, Cloneable {

    int id;
    String str;

    @Override
    public void readFields(ResultSet rs) throws SQLException {

        id = rs.getInt(1);
        str = rs.getString(2);
    }

    @Override
    public void write(PreparedStatement pstmt) throws SQLException {
        // do nothing
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        str = Text.readString(in);
    }

    @Override
    public void write(DataOutput out) throws IOException {

        out.writeInt(id);
        Text.writeString(out, str);
    }
}
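For comparison, a toString() override only changes how the value is printed by a text output format; it does not affect Writable serialization itself. A minimal sketch of such an override (field values invented for illustration):

```java
// Sketch: the same two fields as MyWritable above, plus a toString()
// override so text output shows the fields instead of the default
// ClassName@hashcode form.
class MyRecord {
    int id;
    String str;

    MyRecord(int id, String str) {
        this.id = id;
        this.str = str;
    }

    @Override
    public String toString() {
        return id + "\t" + str;  // tab-separated, like TextOutputFormat's K/V separator
    }
}

public class ToStringDemo {
    public static void main(String[] args) {
        // Prints "1<TAB>foo" rather than something like MyRecord@2b237512.
        System.out.println(new MyRecord(1, "foo"));
    }
}
```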

But I don't understand why the object isn't being serialized.

Thanks

Joan



2011/1/17 Lance Norskog <go...@gmail.com>

> [quoted message trimmed]

Re: how to write custom object using M/R

Posted by Lance Norskog <go...@gmail.com>.
Does your custom object implement Writable? Also, does it implement
toString()? I think this output means the Writable code is not
working:

K  CustomObject@2b237512
K  CustomObject@24db06de

This is Java's default toString() method.

On Mon, Jan 17, 2011 at 12:19 AM, Joan <jo...@gmail.com> wrote:
> [quoted message trimmed]



-- 
Lance Norskog
goksron@gmail.com

Re: how to write custom object using M/R

Posted by Harsh J <qw...@gmail.com>.
1. Your first Job's OutputFormat must be set to SequenceFileOutputFormat
2. Your "custom" object must implement the Writable interface properly
(as in, the readFields() and write() methods must work as expected by
the framework and your requirements).

The fact that your output looks like "K  CustomObject@2b237512" shows
that the custom object isn't being serialized properly (the default
toString() is probably being called because there's no overriding
implementation?)
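Point 1 above can be sketched as wiring for the two-job pipeline. This is a non-authoritative sketch using the new `org.apache.hadoop.mapreduce` API; `CustomObject` and the intermediate path are placeholders from this thread, not tested code:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class TwoJobWiring {
    // Job 1 writes (Text, CustomObject) pairs as a binary sequence file;
    // Job 2 reads the same pairs back. Without setOutputFormatClass, job 1
    // falls back to TextOutputFormat, which just calls toString() on the
    // value -- which is exactly the "CustomObject@2b237512" output seen here.
    public static void wire(Job job1, Job job2, Path intermediate) throws IOException {
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(CustomObject.class);
        job1.setOutputFormatClass(SequenceFileOutputFormat.class);
        SequenceFileOutputFormat.setOutputPath(job1, intermediate);

        job2.setInputFormatClass(SequenceFileInputFormat.class);
        SequenceFileInputFormat.addInputPath(job2, intermediate);
    }
}
```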

On Mon, Jan 17, 2011 at 1:49 PM, Joan <jo...@gmail.com> wrote:
> [quoted message trimmed]



-- 
Harsh J
www.harshj.com

Re: how to write custom object using M/R

Posted by Joan <jo...@gmail.com>.
Hi Alain,

I put it in, but it didn't work.

Joan

2011/1/14 MONTMORY Alain <al...@thalesgroup.com>

> [quoted message trimmed]

Re: how to write custom object using M/R

Posted by Joan <jo...@gmail.com>.
Hi,

I tried it, but it didn't work.

I don't understand why it doesn't work. I only want the first job's
reducer to write my object to HDFS, and the second job's mapper to read
that object back from HDFS.

I'm trying to write the object with SequenceFileOutputFormat and I have
my own Writable (my object implements Writable), but it still doesn't
work. I also call
job.setOutputFormatClass(SequenceFileOutputFormat.class) and
SequenceFileOutputFormat.setOutputPath(conf, outputDir). However, I'm not
using "setOutputCompression".

Joan

2011/1/18 David Rosenstrauch <da...@darose.net>

> [quoted message trimmed]

Re: how to write custom object using M/R

Posted by Joan <jo...@gmail.com>.
2011/1/18 David Rosenstrauch <da...@darose.net>

> [quoted message trimmed]

Re: how to write custom object using M/R

Posted by David Rosenstrauch <da...@darose.net>.
I assumed you were already doing this but yes, Alain is correct, you 
need to set the output format too.

I initialize writing to sequence files like so:

job.setOutputFormatClass(SequenceFileOutputFormat.class);
FileOutputFormat.setOutputName(job, dataSourceName);
FileOutputFormat.setOutputPath(job, hdfsJobOutputPath);
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(job, 
SequenceFile.CompressionType.BLOCK);

DR


On 01/14/2011 01:27 PM, MONTMORY Alain wrote:
> [quoted message trimmed]

RE: how to write custom object using M/R

Posted by MONTMORY Alain <al...@thalesgroup.com>.
Hi,

I think you have to add:

            job.setOutputFormatClass(SequenceFileOutputFormat.class);

to make it work.

Hope this helps,

Alain

[@@THALES GROUP RESTRICTED@@]

From: Joan [mailto:joan.monplet@gmail.com]
Sent: Friday, January 14, 2011 13:58
To: mapreduce-user
Subject: how to write custom object using M/R

[original message trimmed]