Posted to common-user@hadoop.apache.org by Viral K <kh...@yahoo-inc.com> on 2009/06/18 02:57:47 UTC
Re: Pipes example wordcount-nopipe.cc failed when reading from input splits
Does anybody have any updates on this?
How can we have our own RecordReader in Hadoop Pipes? When I try to print
the result of context.getInputSplit(), I get the filename along with some junk
characters, so the file open fails.
Anybody got it working?
Viral.
11 Nov. wrote:
>
> I traced into the C++ RecordReader code:
>
>     WordCountReader(HadoopPipes::MapContext& context) {
>       std::string filename;
>       HadoopUtils::StringInStream stream(context.getInputSplit());
>       HadoopUtils::deserializeString(filename, stream);
>       struct stat statResult;
>       stat(filename.c_str(), &statResult);
>       bytesTotal = statResult.st_size;
>       bytesRead = 0;
>       cout << filename << endl;
>       file = fopen(filename.c_str(), "rt");
>       HADOOP_ASSERT(file != NULL, "failed to open " + filename);
>     }
>
> I got nothing for the filename variable, which showed the InputSplit was
> empty.
>
> 2008/3/4, 11 Nov. <no...@gmail.com>:
>>
>> Hi colleagues,
>> I have set up a single-node cluster to test the Pipes examples.
>> wordcount-simple and wordcount-part work just fine, but
>> wordcount-nopipe won't run. Here is my command line:
>>
>> bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml -input
>> input/ -output out-dir-nopipe1
>>
>> and here is the error message printed on my console:
>>
>> 08/03/03 23:23:06 WARN mapred.JobClient: No job jar file set. User
>> classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> 08/03/03 23:23:06 INFO mapred.FileInputFormat: Total input paths to
>> process : 1
>> 08/03/03 23:23:07 INFO mapred.JobClient: Running job:
>> job_200803032218_0004
>> 08/03/03 23:23:08 INFO mapred.JobClient: map 0% reduce 0%
>> 08/03/03 23:23:11 INFO mapred.JobClient: Task Id :
>> task_200803032218_0004_m_000000_0, Status : FAILED
>> java.io.IOException: pipe child exception
>>         at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:138)
>>         at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:83)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
>>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1787)
>> Caused by: java.io.EOFException
>>         at java.io.DataInputStream.readByte(DataInputStream.java:250)
>>         at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:313)
>>         at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:335)
>>         at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:112)
>>
>> task_200803032218_0004_m_000000_0:
>> task_200803032218_0004_m_000000_0:
>> task_200803032218_0004_m_000000_0:
>> task_200803032218_0004_m_000000_0: Hadoop Pipes Exception: failed to open
>> at /home/hadoop/hadoop-0.15.2-single-cluster/src/examples/pipes/impl/wordcount-nopipe.cc:67 in WordCountReader::WordCountReader(HadoopPipes::MapContext&)
>>
>>
>> Could anybody tell me how to fix this? That will be appreciated.
>> Thanks a lot!
>>
>
>
--
View this message in context: http://www.nabble.com/Pipes-example-wordcount-nopipe.cc-failed-when-reading-from-input-splits-tp15807856p24084734.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Pipes example wordcount-nopipe.cc failed when reading from input splits
Posted by Jianmin Woo <ji...@yahoo.com>.
Hi, Roshan ,
Thanks a lot for your information about the InputSplit between Java and pipes.
-Jianmin
Re: Pipes example wordcount-nopipe.cc failed when reading from input splits
Posted by Roshan James <ro...@gmail.com>.
I did get this working. InputSplit information is not returned clearly. You
may want to look at this thread -
http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200906.mbox/%3Cee216d470906121602k7f914179u5d9555e7bb080edb@mail.gmail.com%3E
Re: Pipes example wordcount-nopipe.cc failed when reading from input splits
Posted by Jianmin Woo <ji...@yahoo.com>.
I tried this example and it seems that the input/output should only be in "file:///..." format to get correct results.
- Jianmin
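Concretely, Jianmin's observation means pointing both -input and -output at local file:// URIs: the example's custom C++ reader opens the path with fopen, which cannot reach hdfs:// paths. A sketch of the invocation under that assumption (the directories below are placeholders, not paths from the thread):

```shell
# wordcount-nopipe supplies its own RecordReader that calls fopen(), so the
# split paths it receives must be local files; hdfs:// inputs fail to open.
bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml \
  -input file:///home/hadoop/input \
  -output file:///home/hadoop/out-dir-nopipe1
```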