Posted to general@hadoop.apache.org by Wojciech Langiewicz <wl...@gmail.com> on 2011/03/15 10:55:26 UTC
java.io.IOException: Split metadata size exceeded 10000000
Hello,
I'm having this problem running mapreduce jobs over about 10TB of data
(smaller jobs are ok):
2011-03-15 07:48:22,031 ERROR org.apache.hadoop.mapred.JobTracker: Job
initialization failed:
java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201103141436_0058
	at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
	at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:732)
	at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:633)
	at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3965)
	at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
2011-03-15 07:48:22,031 INFO org.apache.hadoop.mapred.JobTracker: Failing job job_201103141436_0058
What settings should I change to run this job?
I'm using CDH3b3.
Thanks for all answers.
--
Wojciech Langiewicz
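[Editor's note: a back-of-the-envelope sketch of why a ~10 TB input can exceed the 10,000,000-byte default. The block size and per-split metadata size below are illustrative assumptions, not values reported by the cluster above.]

```python
# Rough estimate of split metadata size for a large MapReduce job.
# All constants here are illustrative assumptions.

TB = 1024 ** 4
MB = 1024 ** 2

input_size = 10 * TB          # ~10 TB of input, as in the job above
block_size = 64 * MB          # assumed HDFS block size (one split per block)
bytes_per_split_meta = 150    # assumed size of one entry in job.splitmetainfo

num_splits = input_size // block_size
meta_size = num_splits * bytes_per_split_meta

print(num_splits)   # 163840 splits
print(meta_size)    # 24576000 bytes, well over the 10,000,000-byte default
```

Under these assumptions the metadata file is roughly 2.5x the limit, which matches the symptom that only the very large job fails.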
RE: java.io.IOException: Split metadata size exceeded 10000000
Posted by "Rottinghuis, Joep" <jr...@ebay.com>.
Doubt this is a CDH3 issue.
We saw the same with a large job using the 0.20-security branch.
There is a property (mapreduce.jobtracker.split.metainfo.maxsize) that can be used to override the default of 10,000,000 (10^7) bytes.
We found that passing this along with the job had no effect; it worked only when the property was set on the JobTracker node itself. Not sure if this is a feature or a bug.
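[Editor's note: a sketch of the workaround Joep describes, set in mapred-site.xml on the JobTracker node (a JobTracker restart is required for it to take effect). The 100000000 value is an arbitrary example; in this code path a value of -1 disables the size check entirely.]

```xml
<!-- mapred-site.xml on the JobTracker node (example value) -->
<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>100000000</value>
  <!-- or -1 to disable the split metadata size check -->
</property>
```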
Cheers,
Joep
-----Original Message-----
From: Harsh J [mailto:qwertymaniac@gmail.com]
Sent: Tuesday, March 15, 2011 3:33 AM
To: CDH Users
Cc: wlangiewicz@gmail.com
Subject: Re: java.io.IOException: Split metadata size exceeded 10000000
Moving this discussion to the CDH users list at cdh-user [at]
cloudera.org since it could be a CDH specific issue.
[Bcc: general]
On Tue, Mar 15, 2011 at 3:25 PM, Wojciech Langiewicz
<wl...@gmail.com> wrote:
> Hello,
> I'm having this problem running mapreduce jobs over about 10TB of data
> (smaller jobs are ok):
> 2011-03-15 07:48:22,031 ERROR org.apache.hadoop.mapred.JobTracker: Job
> initialization failed:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201103141436_0058
> 	at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
> 	at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:732)
> 	at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:633)
> 	at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3965)
> 	at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
>
> 2011-03-15 07:48:22,031 INFO org.apache.hadoop.mapred.JobTracker: Failing job job_201103141436_0058
>
> What settings should I change to run this job?
> I'm using CDH3b3.
> Thanks for all answers.
>
> --
> Wojciech Langiewicz
>
--
Harsh J
http://harshj.com
Re: java.io.IOException: Split metadata size exceeded 10000000
Posted by Harsh J <qw...@gmail.com>.
Moving this discussion to the CDH users list at cdh-user [at]
cloudera.org since it could be a CDH specific issue.
[Bcc: general]
On Tue, Mar 15, 2011 at 3:25 PM, Wojciech Langiewicz
<wl...@gmail.com> wrote:
> Hello,
> I'm having this problem running mapreduce jobs over about 10TB of data
> (smaller jobs are ok):
> 2011-03-15 07:48:22,031 ERROR org.apache.hadoop.mapred.JobTracker: Job
> initialization failed:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201103141436_0058
> 	at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
> 	at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:732)
> 	at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:633)
> 	at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3965)
> 	at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
>
> 2011-03-15 07:48:22,031 INFO org.apache.hadoop.mapred.JobTracker: Failing job job_201103141436_0058
>
> What settings should I change to run this job?
> I'm using CDH3b3.
> Thanks for all answers.
>
> --
> Wojciech Langiewicz
>
--
Harsh J
http://harshj.com