Posted to user@pig.apache.org by Kris Coward <kr...@melon.org> on 2011/02/28 04:47:58 UTC

Problems loading a datafile..

So I finally got a couple of test scripts running on my cluster to take
a sample data file, load it, do a little processing, store it, load it,
do a little more processing, and dump the results.

Once these were working, I set to parsing and storing some real data,
but then got an "Unable to create input slice" error when trying to load
this data back out again. This happened with each of:

foo = LOAD '/path/to/file/{item,list,glob}/*/subdir' USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);
foo = LOAD '/path/to/file/item/*/subdir' USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);
foo = LOAD '/path/to/file/item/ex/subdir' USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);

and yielded the error (the same each time, except for the name/glob
used):

ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000//path/to/file/item/ex/subdir
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias foo
        at org.apache.pig.PigServer.openIterator(PigServer.java:482)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:352)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/path/to/file/item/ex/subdir
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
        at org.apache.pig.PigServer.store(PigServer.java:529)
        at org.apache.pig.PigServer.openIterator(PigServer.java:465)
        ... 6 more


Anyone have any suggestions why this may be happening and how to fix it?

Thanks,
Kris

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Yep, all you have to do is upgrade to Pig 0.8...
This sort of thing is one of the reasons Load/Store interfaces were
completely redesigned after Pig 0.6.

On Wed, Mar 2, 2011 at 8:46 PM, Kris Coward <kr...@melon.org> wrote:

>
> Yep. That did it. Now if you don't mind my asking, is there any way to
> direct LzoTokenizedStorage to put that extension on the part files when
> it's writing them in the first place?
>
> -K

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
Yep. That did it. Now if you don't mind my asking, is there any way to
direct LzoTokenizedStorage to put that extension on the part files when
it's writing them in the first place?

-K

On Wed, Mar 02, 2011 at 03:17:09PM -0800, Dmitriy Ryaboy wrote:
> Oh.
> Yea we expect LZO files to have a .lzo extension.
> 
> D

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Oh.
Yea we expect LZO files to have a .lzo extension.

D
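
For anyone hitting the same "no files found" error: a minimal sketch of one way to add the missing extension to files that were already written. It only builds the `hadoop fs -mv` command lines so they can be reviewed before being run. The helper name `lzo_rename_commands` and the `part-0000N` file names are assumptions made for illustration; only the `/hadooptest/lzofile` directory comes from the trace quoted below.

```python
import shlex


def lzo_rename_commands(paths, suffix=".lzo"):
    """Build `hadoop fs -mv` command lines for HDFS paths lacking `suffix`.

    Hypothetical helper: nothing is executed here, the commands are only
    returned as strings so they can be inspected first.
    """
    commands = []
    for path in paths:
        if path.endswith(suffix):
            # Already has the extension the loader expects; leave it alone.
            continue
        commands.append(
            "hadoop fs -mv {} {}".format(
                shlex.quote(path), shlex.quote(path + suffix)
            )
        )
    return commands


if __name__ == "__main__":
    # Assumed part-file names under the directory from the stack trace.
    listing = [
        "/hadooptest/lzofile/part-00000",
        "/hadooptest/lzofile/part-00001.lzo",
    ]
    for line in lzo_rename_commands(listing):
        print(line)
```

The list of paths would come from something like the `hadoop fs -lsr` output mentioned elsewhere in this thread. Whether the loader then picks up the renamed files depends on the elephant-bird version in use.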

On Wed, Mar 2, 2011 at 12:16 PM, Kris Coward <kr...@melon.org> wrote:

>
> I might still be missing something useful (we're running elephant-bird
> from the gpl-packing distribution, and I've registered most of the
> jarfiles from it), but the stack trace has changed a little, so now
> it's producing:
>
> Backend error message during job submission
> -------------------------------------------
> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
>        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>        at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.pig.PigException: ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile
>        at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.slice(Unknown Source)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:260)
>        ... 7 more
>
> Pig Stack Trace
> ---------------
> ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias test4
>        at org.apache.pig.PigServer.openIterator(PigServer.java:482)
>        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
>        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>        at org.apache.pig.Main.main(Main.java:352)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
>        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
>        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
>        at org.apache.pig.PigServer.store(PigServer.java:529)
>        at org.apache.pig.PigServer.openIterator(PigServer.java:465)
>        ... 6 more
>
> ================================================================================
>
> The "ERROR 0: no files found a path
> hdfs://master.hadoop:9000/hadooptest/lzofile"
> message has me really puzzled because in grunt I can see the files, I
> can copy them to local, I can rename them with .lzo on the end,
> uncompress them, and see the data that I expect, and I can even load
> them with PigLoader (though obviously the data's all wrong when I do
> that).
>
> Any more tips?
>
> Thanks,
> Kris
>

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
I might still be missing something useful (we're running elephant-bird
from the gpl-packing distribution, and I've registered most of the
jarfiles from it), but the stack trace has changed a little, so now
it's producing:

Backend error message during job submission
-------------------------------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.pig.PigException: ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile
        at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.slice(Unknown Source)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:260)
        ... 7 more

Pig Stack Trace
---------------
ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias test4
        at org.apache.pig.PigServer.openIterator(PigServer.java:482)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:352)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
        at org.apache.pig.PigServer.store(PigServer.java:529)
        at org.apache.pig.PigServer.openIterator(PigServer.java:465)
        ... 6 more
================================================================================

The "ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile"
message has me really puzzled because in grunt I can see the files, I
can copy them to local, I can rename them with .lzo on the end,
uncompress them, and see the data that I expect, and I can even load
them with PigLoader (though obviously the data's all wrong when I do
that).

Any more tips?

Thanks,
Kris

On Wed, Mar 02, 2011 at 09:32:47AM -0800, Dmitriy Ryaboy wrote:
> Off the top of my head, I can't think of anything, but you can just grab
> everything in Elephant-Bird's lib/ directory and make sure it's on the
> classpath on all the task trackers and your client machine (you can
> propagate it to the TTs via the register keyword if you don't want to bug
> your hadoop sysadmin and restart things).
> 
> D
> 

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Off the top of my head, I can't think of anything, but you can just grab
everything in Elephant-Bird's lib/ directory and make sure it's on the
classpath on all the task trackers and your client machine (you can
propagate it to the TTs via the register keyword if you don't want to bug
your hadoop sysadmin and restart things).
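
A minimal sketch of the register approach, in case it helps. The jar paths
and file names below are hypothetical placeholders (they are not taken from
this thread); substitute whatever actually sits in your Elephant-Bird and
LZO install. The LOAD line is the one from the original post, with the AS
clause left off because the schema was elided there.

```pig
-- Hypothetical jar locations: replace with the real paths on the client machine.
-- Each REGISTER ships the named jar along with the job, so the task trackers
-- can load the classes without a cluster-wide classpath change.
REGISTER /path/to/elephant-bird/elephant-bird.jar;
REGISTER /path/to/elephant-bird/lib/hadoop-lzo.jar;

-- Same loader and path as in the original post; fields are left untyped here.
foo = LOAD '/path/to/file/item/ex/subdir'
      USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',');
DUMP foo;
```

One REGISTER statement per jar keeps it obvious which dependency is missing
if the loader still fails to resolve.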

D

On Wed, Mar 2, 2011 at 9:25 AM, Kris Coward <kr...@melon.org> wrote:

>
> Nope; they're reproduced across all the machines. Does the
> LzoTokenizedLoader class have any dependencies that LzoTokenizedStorage
> doesn't (which I may be overlooking)?
>
> -K
>

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
Nope; they're reproduced across all the machines. Does the
LzoTokenizedLoader class have any dependencies that LzoTokenizedStorage
doesn't (which I may be overlooking)?

-K

On Tue, Mar 01, 2011 at 07:17:10PM -0500, Kris Coward wrote:
> 
> What's peculiar is that the test script for the loader class that was
> run a week ago seems also to be failing with the same error. We've added
> nodes to the cluster; maybe the relevant .jar files haven't been copied
> over to those nodes. I'll bug our sysadmin about that..
> 
> Thanks,
> Kris
> 
> -- 
> Kris Coward					http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
What's peculiar is that the test script for the loader class that was
run a week ago seems also to be failing with the same error. We've added
nodes to the cluster; maybe the relevant .jar files haven't been copied
over to those nodes. I'll bug our sysadmin about that..

Thanks,
Kris

On Tue, Mar 01, 2011 at 02:08:32PM -0800, Dmitriy Ryaboy wrote:
> Kris,
> Check the pig log file. Often "unable to create input slice" is caused by
> errors such as not being able to find your loader class, or some dependency
> of your loader class.
> 
> D
> 

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Kris,
Check the Pig log file. Often "Unable to create input slice" is caused by
errors such as Pig not being able to find your loader class, or some
dependency of your loader class.

D
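
One way to follow up on this suggestion is to search the client-side Pig log
for class-loading failures. The sketch below is only an illustration: the
function name find_loader_errors is made up for this example, the three Java
error names are common symptoms of a missing jar or native library rather
than an exhaustive list, and the log path is whatever file your Pig run
reported (often a pig_*.log file in the directory Pig was launched from).

```shell
# Hypothetical helper (not part of Pig or Hadoop): print every line of a log
# file that mentions a Java class-loading or native-library failure.
# Usage: find_loader_errors /path/to/pig_logfile.log
find_loader_errors() {
    log_file="$1"
    # -n prefixes each match with its line number; -E enables alternation.
    grep -n -E 'ClassNotFoundException|NoClassDefFoundError|UnsatisfiedLinkError' "$log_file"
}
```

If this prints a line naming the loader class or one of its dependencies, the
corresponding jar or native library is probably not visible to Pig. An empty
result only means these particular errors are not in that log.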

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
I get the output:

-rw-r--r--   2 kris supergroup     172694 2011-02-25 01:59 /path/to/file/item/ex/subdir

-K

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Problems loading a datafile..

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
What happens when you "hadoop fs -lsr" those paths?

D
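
For anyone repeating this check, a small wrapper can list each of the three
LOAD paths in turn. This is a sketch, not something from the thread:
lsr_paths and the HADOOP_BIN override are names invented here, and the
override exists only so the function can be dry-run without a cluster. The
paths are quoted so that the local shell passes each glob through untouched
and leaves the expansion to Hadoop.

```shell
# Hypothetical helper: run "hadoop fs -lsr" on each path given as an argument.
# HADOOP_BIN is an assumed override, defaulting to the hadoop found on PATH.
lsr_paths() {
    for p in "$@"; do
        # Quote "$p" so the path reaches Hadoop exactly as given.
        "${HADOOP_BIN:-hadoop}" fs -lsr "$p"
    done
}

# Example call, left commented out because it needs a running cluster:
# lsr_paths '/path/to/file/{item,list,glob}/*/subdir' \
#           '/path/to/file/item/*/subdir' \
#           '/path/to/file/item/ex/subdir'
```

Setting HADOOP_BIN=echo before the call prints the commands instead of
running them, which is a quick way to confirm the quoting.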

Re: Problems loading a datafile..

Posted by Kris Coward <kr...@melon.org>.
Oh, and I also get the same error if I omit the schema >:(

-K

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3