Posted to mapreduce-user@hadoop.apache.org by Kunal Gupta <ku...@techlead-india.com> on 2009/12/01 07:57:49 UTC

Re: How to write a custom input format and record reader to read multiple lines of text from files

Can you kindly guide me on what initialisation I need to do in the
constructor of the implemented class, MultiLineFileInputFormat?

I was following the sample provided on this Yahoo page:

http://developer.yahoo.com/hadoop/tutorial/module5.html#fileformat




On Tue, 2009-12-01 at 06:45 +0000, Sean Owen wrote:
> It sounds like you have not provided a no-arg constructor in
> MultiLineFileInputFormat.
> 
> On Tue, Dec 1, 2009 at 6:17 AM, Kunal Gupta <ku...@techlead-india.com> wrote:
> > Can someone explain how to override the "FileInputFormat" and
> > "RecordReader" in order to be able to read multiple lines of text from
> > input files in a single map task?
> >
> > Here the key will be the offset of the first line of text and value will
> > be the N lines of text.
> >
> > I have extended the class FileInputFormat:
> >
> > public class MultiLineFileInputFormat
> >        extends FileInputFormat<LongWritable, Text>{
> > ...
> > }
> >
> > and implemented the abstract method:
> >
> > public RecordReader createRecordReader(InputSplit split,
> >                TaskAttemptContext context)
> >         throws IOException, InterruptedException {...}
> >
> > I have also extended the RecordReader class:
> >
> > public class MultiLineFileRecordReader extends
> > RecordReader<LongWritable, Text>
> > {...}
> >
> > and in the job configuration, specified this new InputFormat class:
> >
> > job.setInputFormatClass(MultiLineFileInputFormat.class);
> >
> > --------------------------------------------------------------------------
> > When I run this new map/reduce program, I get the following Java error:
> > --------------------------------------------------------------------------
> > Exception in thread "main" java.lang.RuntimeException: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
> >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
> >        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> >        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> >        at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> >        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> >        at CustomRecordReader.main(CustomRecordReader.java:257)
> > Caused by: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
> >        at java.lang.Class.getConstructor0(Class.java:2706)
> >        at java.lang.Class.getDeclaredConstructor(Class.java:1985)
> >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:109)
> >        ... 5 more
> >
> >
> 
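
For the grouping asked about in the original question (key = offset of the first line, value = the next N lines), here is a minimal plain-Java sketch of the logic only. It is not Hadoop code: the class MultiLineGrouping and its methods are hypothetical names, it assumes UTF-8 text with a single '\n' terminator per line when computing byte offsets, it assumes a positive line count per record, and it ignores split boundaries, which a real RecordReader would have to handle.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: groups every N lines into one record.
public class MultiLineGrouping {

    // Reads lines and emits (byte offset of first line, N lines joined by '\n').
    // Assumption: each line ends with exactly one '\n' byte in the source.
    public static List<Map.Entry<Long, String>> group(BufferedReader in, int linesPerRecord)
            throws IOException {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        long offset = 0;       // byte offset of the line about to be read
        long recordStart = 0;  // byte offset of the first line of the current record
        int linesInRecord = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (linesInRecord == 0) {
                recordStart = offset;
            } else {
                value.append('\n');
            }
            value.append(line);
            linesInRecord++;
            // Advance past the line's bytes plus the assumed one-byte terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
            if (linesInRecord == linesPerRecord) {
                records.add(new SimpleEntry<Long, String>(recordStart, value.toString()));
                value.setLength(0);
                linesInRecord = 0;
            }
        }
        // Emit a final, shorter record if the input did not divide evenly.
        if (linesInRecord > 0) {
            records.add(new SimpleEntry<Long, String>(recordStart, value.toString()));
        }
        return records;
    }

    // Convenience wrapper for in-memory text; wraps the checked IOException.
    public static List<Map.Entry<Long, String>> groupText(String text, int linesPerRecord) {
        try {
            return group(new BufferedReader(new StringReader(text)), linesPerRecord);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> record : groupText("a\nbb\nccc\ndddd\ne\n", 2)) {
            System.out.println(record.getKey() + " -> " + record.getValue().replace("\n", " | "));
        }
    }
}
```

With the sample input in main and two lines per record, the keys come out as 0, 5 and 14, and the last record holds the single leftover line.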


Re: How to write a custom input format and record reader to read multiple lines of text from files

Posted by Kunal Gupta <ku...@techlead-india.com>.
I have not implemented a constructor for the class extending
FileInputFormat.
I had actually implemented a constructor that takes arguments for the
class extending RecordReader, which had no no-arg constructor.
After reading your comment I wrote a no-arg constructor for that class,
but I am still getting the same error.

The following is the class extending FileInputFormat:

	public class MultiLineFileInputFormat
			extends FileInputFormat<LongWritable, Text> {

		/*public MultiLineFileInputFormat()
		{
			super();
		}*/

		@Override
		public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
				TaskAttemptContext context)
				throws IOException, InterruptedException {
			// Show the split being processed in the task status.
			context.setStatus(split.toString());
			return new MultiLineFileRecordReader((FileSplit) split, context);
		}
	}
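
One thing worth checking, going by the CustomRecordReader$MultiLineFileInputFormat name in the exception: the '$' suggests the input format is nested inside CustomRecordReader. If it is declared there without 'static', as the listing above appears to be, it is an inner class, and its constructor implicitly takes the enclosing instance as a parameter. A reflective lookup for a zero-argument constructor would then fail even though no constructor was written by hand. The sketch below reproduces that with plain Java; NestedClassDemo, InnerFormat and StaticFormat are hypothetical stand-ins, and no Hadoop classes are involved.

```java
import java.lang.reflect.Constructor;

// Hypothetical outer class standing in for CustomRecordReader.
public class NestedClassDemo {

    // Inner (non-static) class: its only constructor takes the enclosing
    // NestedClassDemo instance as a hidden first parameter.
    public class InnerFormat {
    }

    // Static nested class: its implicit constructor takes no parameters.
    public static class StaticFormat {
    }

    // Same kind of lookup as in the stack trace: ask for the declared
    // zero-parameter constructor, then call it.
    public static Object instantiate(Class<?> cls) throws ReflectiveOperationException {
        Constructor<?> ctor = cls.getDeclaredConstructor();
        ctor.setAccessible(true);
        return ctor.newInstance();
    }

    // Returns false when the reflective instantiation fails for any reason,
    // including NoSuchMethodException.
    public static boolean canInstantiate(Class<?> cls) {
        try {
            instantiate(cls);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("InnerFormat:  " + canInstantiate(InnerFormat.class));
        System.out.println("StaticFormat: " + canInstantiate(StaticFormat.class));
    }
}
```

If that is what is happening here, declaring the nested class as 'public static class', or moving it to its own top-level file, should give the framework the no-arg constructor it looks for.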


On Tue, 2009-12-01 at 07:20 +0000, Sean Owen wrote:
> It sounds like you have declared a constructor in
> MultiLineFileInputFormat that needs an argument. By doing so, no
> no-arg constructor is automatically generated. Unless you write one,
> it won't exist. The Hadoop framework instantiates your class by
> calling the no-arg constructor. The error you get says this directly:
> there is no no-arg constructor. Write one to fix it.
> 
> The example you reference has a no-arg constructor, by default, since
> it declares no constructors at all.
> 


Re: How to write a custom input format and record reader to read multiple lines of text from files

Posted by Sean Owen <sr...@gmail.com>.
It sounds like you have declared a constructor in
MultiLineFileInputFormat that needs an argument. By doing so, no
no-arg constructor is automatically generated. Unless you write one,
it won't exist. The Hadoop framework instantiates your class by
calling the no-arg constructor. The error you get says this directly:
there is no no-arg constructor. Write one to fix it.

The example you reference has a no-arg constructor, by default, since
it declares no constructors at all.
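
The constructor rule described above can be checked with plain reflection. This is only an illustrative sketch: DefaultConstructorDemo and its nested classes are hypothetical names, and it uses Class.getDeclaredConstructor(), the call that appears in the stack trace, rather than any Hadoop API.

```java
// Hypothetical demo class; none of these names come from Hadoop.
public class DefaultConstructorDemo {

    // Declares no constructors, so the compiler supplies a no-arg one.
    public static class ImplicitDefault {
    }

    // Declares only a one-argument constructor, so no no-arg constructor exists.
    public static class ArgOnly {
        public ArgOnly(String name) {
        }
    }

    // Declares both, so the no-arg constructor exists because it was written.
    public static class Both {
        public Both() {
        }

        public Both(String name) {
        }
    }

    // Returns true if the class declares a zero-parameter constructor.
    public static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            cls.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ImplicitDefault: " + hasNoArgConstructor(ImplicitDefault.class));
        System.out.println("ArgOnly:         " + hasNoArgConstructor(ArgOnly.class));
        System.out.println("Both:            " + hasNoArgConstructor(Both.class));
    }
}
```

Only ArgOnly reports no zero-parameter constructor, which matches the rule: the implicit one disappears as soon as any constructor is declared, and comes back only if it is written explicitly.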
