Posted to user@pig.apache.org by Kris Coward <kr...@melon.org> on 2011/01/31 19:29:48 UTC

Problems with STORE

So I have a relation apa which, when DUMPed, is output just fine,
but when I run
STORE apa INTO '/rawfiles/f3453efd460348bbaeee2e9496e25871/1294311600/apa' USING PigStorage(',');
I get the following error:

java.io.IOException: Mkdirs failed to create file:/rawfiles/f3453efd460348bbaeee2e9496e25871/1294311600/apa/_temporary/_attempt_local_0007_m_000000_0
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:526)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:507)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:414)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:406)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:177)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:96)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:80)
    at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:624)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
2011-01-31 16:42:31,999 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Submitting job: job_local_0007 to execution engine.
2011-01-31 16:42:32,501 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-01-31 16:42:37,021 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-01-31 16:42:37,021 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2011-01-31 16:42:37,021 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/rawfiles/f3453efd460348bbaeee2e9496e25871/1294311600/apa"
2011-01-31 16:42:37,022 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

I've manually created the directory /rawfiles/f3453efd460348bbaeee2e9496e25871/1294311600/apa
from within grunt to verify that it wasn't a permissions problem (and
then removed apa so that STORE wouldn't fail on account of the directory
already existing), and the error persists.

Any advice on what might be causing this problem?

Thanks,
Kris

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: STORE

Posted by Dmitriy Ryaboy <dv...@gmail.com>.

In what way are you gathering results?

Solutions typically involve a choice of:
* don't -- just read directories
* openIterator (which is the same thing, really)
* use a single reducer
* hadoop fs -cat /path/to/output/* > myoutput
* use HAR
* write your own

Pig (and Hadoop) don't store your results in a single file because
that would force all the reducers to coordinate writing their output;
with each one writing its own part file, they run completely independently.
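
For output that already sits on a local file system (as it does in local mode), the `cat` option from the list above can be sketched as a small shell helper. This is only an illustration: the function name `merge_parts` is made up here, and it assumes Hadoop's usual `part-*` file naming. For output in HDFS, `hadoop fs -getmerge` does a similar job and copies the merged result to a local file.

```shell
# Concatenate every part file in a local output directory into one file.
# Usage: merge_parts OUTPUT_DIR MERGED_FILE
# (merge_parts is a hypothetical helper name, not part of Pig or Hadoop.)
merge_parts() {
    out_dir=$1
    merged=$2
    : > "$merged"                      # start from an empty merged file
    for part in "$out_dir"/part-*; do
        [ -f "$part" ] || continue     # skip directories and an unmatched glob
        cat "$part" >> "$merged"       # the glob expands in sorted order
    done
}
```

Calling `merge_parts /path/to/output myoutput` would then leave a single file and ignore side entries such as `_logs` or `_temporary`, since they do not match `part-*`.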

-D


On Mon, Jan 31, 2011 at 4:13 PM,  <an...@nokia.com> wrote:
> Is there a way to tell pig in map red embedded mode
> to store all my results in single results file
> instead of all the parts that it creates
> that I have to merge afterwards
>
> if it is not possible then
> what is the recommended way to gather the results (using openIterator ?)
>
> thanks
> Anindita
>

STORE

Posted by an...@nokia.com.

Is there a way to tell Pig, in MapReduce embedded mode,
to store all my results in a single results file
instead of all the part files that it creates,
which I then have to merge afterwards?

If that is not possible,
what is the recommended way to gather the results (using openIterator?)

thanks
Anindita

Re: Problems with STORE

Posted by Dmitriy Ryaboy <dv...@gmail.com>.

The directory it's trying to create is on the local file system of a node
(note the file: prefix in the path; it's temp storage), not in HDFS.
Do you have /rawfiles/ set up as temp storage for Hadoop?
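
Since the failing path carries a `file:` prefix and the trace goes through LocalJobRunner, one thing worth checking is whether the user running Pig can create that path on the local disk at all. The following is a rough sketch of such a check, not a definitive diagnosis; the function names `nearest_existing_dir` and `can_create_locally` are invented for this example.

```shell
# Print the deepest ancestor of a path that already exists as a directory.
nearest_existing_dir() {
    dir=$1
    while [ ! -d "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        dir=$(dirname "$dir")          # walk one level up
    done
    printf '%s\n' "$dir"
}

# Report whether the current user could create the given local path.
can_create_locally() {
    parent=$(nearest_existing_dir "$1")
    if [ -w "$parent" ] && [ -x "$parent" ]; then
        echo "ok: $parent is writable"
    else
        echo "blocked: $parent is not writable by $(id -un)"
        return 1
    fi
}
```

Run on the machine where the job executes, for example as `can_create_locally /rawfiles/f3453efd460348bbaeee2e9496e25871/1294311600/apa/_temporary`. A "blocked" result pointing at `/` would suggest that `/rawfiles` simply does not exist on the local disk, even if a directory of the same name exists in HDFS.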

-D