Posted to user@giraph.apache.org by Panagiotis Liakos <p....@di.uoa.gr> on 2015/07/15 10:15:36 UTC

Error executing SimplePageRankComputation

Hello,

I have successfully set up hadoop 1.2.1 and giraph 1.1.0 and tested
the SimpleShortestPathsComputation as shown in the quick start guide.

However, I am more interested in SimplePageRankComputation, and I
have not been able to run it.

In particular, I issue the following command:
$HADOOP_HOME/bin/hadoop jar \
  $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimplePageRankComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hduser/input/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/output/pagerank \
  -w 1

I use the same graph and format as with the
SimpleShortestPathsComputation task. The job is successfully submitted
but fails after 10 minutes with the following task error:

Task attempt_201507131607_0013_m_000000_0 failed to report status for
600 seconds. Killing!

Am I doing something wrong? Should I provide more arguments when I
submit the task?

Thank you,
Panagiotis

The output is the following:

15/07/15 10:43:11 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
15/07/15 10:43:11 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
15/07/15 10:43:12 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
15/07/15 10:43:13 INFO job.GiraphJob: Tracking URL: http://hdnode01:50030/jobdetails.jsp?jobid=job_201507131607_0013
15/07/15 10:43:13 INFO job.GiraphJob: Waiting for resources... Job will start only when it gets all 2 mappers
15/07/15 10:44:24 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer giraph01:22181 --zkNode /_hadoopBsp/job_201507131607_0013/_haltComputation'
15/07/15 10:44:24 INFO mapred.JobClient: Running job: job_201507131607_0013
15/07/15 10:44:25 INFO mapred.JobClient:  map 100% reduce 0%
15/07/15 10:53:32 INFO mapred.JobClient:  map 0% reduce 0%
15/07/15 10:53:33 INFO mapred.JobClient: Job complete: job_201507131607_0013
15/07/15 10:53:33 INFO mapred.JobClient: Counters: 6
15/07/15 10:53:33 INFO mapred.JobClient:   Job Counters
15/07/15 10:53:33 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=616166
15/07/15 10:53:33 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
15/07/15 10:53:33 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
15/07/15 10:53:33 INFO mapred.JobClient:     Launched map tasks=2
15/07/15 10:53:33 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
15/07/15 10:53:33 INFO mapred.JobClient:     Failed map tasks=1

Re: Error executing SimplePageRankComputation

Posted by Panagiotis Liakos <P....@di.uoa.gr>.
Thank you for the reply, Sonja.

I am following the quick start guide on the Apache Giraph website, so I'm
working on a single-node, pseudo-distributed cluster.

I have no idea why there are 2 map tasks. My configuration is the following:

<property>
<name>mapred.job.tracker</name>
<value>hdnode01:54311</value>
</property>

<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>

<property>
<name>mapred.map.tasks</name>
<value>4</value>
</property>

The task log contains an exception with the error I posted in my
previous e-mail: Task attempt_201507131607_0013_m_000000_0 failed to
report status for 600 seconds. Killing!

However, I should add that I managed to run the example a little
earlier by additionally providing the -masterCompute argument (-mc
org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute).

I wonder whether this is a prerequisite for SimplePageRankComputation and,
if so, why SimpleShortestPathsComputation does not need it. Does anyone
have any idea?
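
On the -mc question: as far as I can tell from the Giraph 1.1.0 examples,
SimplePageRankComputation aggregates values (sum, min, max) whose
aggregators are registered by its nested SimplePageRankMasterCompute class,
while SimpleShortestPathsComputation uses no aggregators. That would
explain why only the PageRank example needs the extra argument, but treat
it as an assumption rather than a confirmed diagnosis. The sketch below
assembles the command from the original post with -mc added. It only prints
the command; the MASTER_COMPUTE and GIRAPH_JAR variable names are
illustrative and not part of Giraph.

```shell
# Single quotes stop the shell from expanding "$SimplePageRankMasterCompute"
# as a (probably empty) variable; the backslash escape in the thread does the same.
MASTER_COMPUTE='org.apache.giraph.examples.SimplePageRankComputation$SimplePageRankMasterCompute'

# Jar path as given in the original post; adjust it to your own build.
GIRAPH_JAR="$GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-1.2.1-jar-with-dependencies.jar"

# Print the command for inspection; drop the leading "echo" to submit the job.
echo "$HADOOP_HOME/bin/hadoop" jar "$GIRAPH_JAR" \
  org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimplePageRankComputation \
  -mc "$MASTER_COMPUTE" \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hduser/input/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/output/pagerank \
  -w 1
```

If the class name is passed unquoted and unescaped, the shell is likely to
strip everything from the $ onward, and Giraph would then receive the outer
class name instead of the master compute class.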

Thank you,
Panagiotis

> Hi there!
>
> I'm guessing you're working in pseudo-distributed mode, since you
> passed the argument -w 1.
> Have you taken a look at the task log? It sounds like you might have an
> exception and/or a connection loss.
>
> Also I am wondering why your computation launched 2 map tasks.
> What is your configuration in mapred-site.xml?
>
> Greetings!
> Sonja
>



Re: Error executing SimplePageRankComputation

Posted by Sonja Koenig <so...@uni-ulm.de>.
Hi there!

I'm guessing you're working in pseudo-distributed mode, since you
passed the argument -w 1.
Have you taken a look at the task log? It sounds like you might have an
exception and/or a connection loss.

Also I am wondering why your computation launched 2 map tasks.
What is your configuration in mapred-site.xml?
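
A note on the 2 map tasks: as far as I know, when giraph.SplitMasterWorker
is left at its default (true), a Giraph job on Hadoop 1.x asks for one map
task per worker plus one extra task for the master. With -w 1 that gives 2,
which matches the "Job will start only when it gets all 2 mappers" line in
the log. A rough sketch of that arithmetic, with hypothetical variable names
and the default treated as an assumption:

```shell
# Hypothetical variable names, for illustration only.
workers=1                  # the value passed with -w
split_master_worker=true   # assumed default of giraph.SplitMasterWorker

if [ "$split_master_worker" = true ]; then
  mappers=$((workers + 1))   # one extra map task hosts the master
else
  mappers=$workers           # the master shares a task with a worker
fi
echo "map tasks requested: $mappers"
```

So 2 launched map tasks with -w 1 is probably expected and is not by itself
a sign of misconfiguration in mapred-site.xml.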

Greetings!
Sonja
