Posted to user@hama.apache.org by andronat_asf <an...@hotmail.com> on 2013/03/09 19:27:30 UTC

Write Hama GraphFile problem

Hello everyone,

I was reading the tutorial for the Google Web dataset (local mode, pseudo-distributed cluster) at http://wiki.apache.org/hama/WriteHamaGraphFile.

I downloaded a graph (~1GB) and uploaded it to HDFS. The file was split into 17 HDFS blocks. (As a reminder, I am running in pseudo-distributed mode, with everything on my laptop.)

ls -l /tmp/hadoop/dfs/data/current/  | grep -v 'meta' | wc -l
      17

Then I tried to run some code I wrote based on the example, but the job was killed:

# tail -f hama-my-bspmaster-my.local.log 

2013-03-09 20:09:47,060 INFO org.apache.hama.bsp.JobInProgress: num BSPTasks: 17
2013-03-09 20:09:47,100 INFO org.apache.hama.bsp.JobInProgress: Job is initialized.
2013-03-09 20:09:47,103 ERROR org.apache.hama.bsp.SimpleTaskScheduler: Could not schedule all tasks!
2013-03-09 20:09:47,103 ERROR org.apache.hama.bsp.SimpleTaskScheduler: Scheduling of job Hama Graph Loader could not be done successfully. Killing it!

My configuration is:

HamaConfiguration conf = new HamaConfiguration(new Configuration());
conf.set("bsp.local.tasks.maximum", "1");

GraphJob graphJob = new GraphJob(conf, HamaGraphLoader.class);
graphJob.setNumBspTask(1);

I also tried to change the values of:

conf.set("bsp.tasks.maximum", "1");
conf.set("bsp.max.tasks.per.job", "1");
conf.set("mapred.map.tasks", "1");
conf.set("mapred.min.split.size", String.valueOf(Long.MAX_VALUE));

I even changed the corresponding variables in hama-default.xml, but nothing seems to work. The number of BSPTasks remains at 17.

I attach my full configuration just in case.

Thank you in advance,
Anastasis


Re: Write Hama GraphFile problem

Posted by andronat_asf <an...@hotmail.com>.
Thank you very much.

I will try to limit the size of the graph when testing in pseudo-distributed mode.

Kind regards,
Anastasis

On 11 Mar 2013, at 11:51 AM, Thomas Jungblut <th...@gmail.com> wrote:

> Hi,
> 
> I think you are confusing the two modes.
> There is local mode, where no daemons are started, and pseudo-distributed mode, which is apparently what you are running. More information about the modes can be found here [1] or in our user documentation, which is linked somewhere near [1].
> 
> Also, the data is split according to its HDFS blocks, so if you want to run a single task you have to make the file a single block by overriding its block size to 1 GB.
> Note that even if you get it to run, it may use a lot of memory. I expect these problems to be fixed in the next major release; the current trunk version is not stable for graph computing because the partitioning is very strange, and that will be fixed soon.
> 
> Sorry for the inconvenience.
> 
> [1] http://wiki.apache.org/hama/GettingStarted#Modes
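
Editorial sketch: the relationship between file size, block size, and task count described above can be checked with a little arithmetic. The class below is a hypothetical helper, not part of Hama or Hadoop. The 64 MB block size and the exact file size of 1,100,000,000 bytes are assumptions chosen to match the "~1GB" file and the 17 tasks reported in the log; the real values on any given cluster may differ.

```java
/** Hypothetical helper (not part of Hama or Hadoop): estimates HDFS block counts. */
final class BlockCount {

    /** Number of blocks needed to store fileSize bytes, i.e. ceil(fileSize / blockSize). */
    static long numBlocks(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize; // integer ceiling division
    }

    public static void main(String[] args) {
        long fileSize = 1_100_000_000L;              // assumed size of the "~1GB" graph file
        long defaultBlock = 64L * 1024 * 1024;       // assumed 64 MB block size
        long largeBlock = 2L * 1024 * 1024 * 1024;   // assumed override larger than the file

        System.out.println(numBlocks(fileSize, defaultBlock)); // 17
        System.out.println(numBlocks(fileSize, largeBlock));   // 1
    }
}
```

Under these assumptions, the file only becomes a single block when the block size is at least as large as the file itself, so a file slightly over 1 GB would still need a block size above 1 GB to yield one task.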



Re: Write Hama GraphFile problem

Posted by "Edward J. Yoon" <ed...@apache.org>.
I think there's a problem in your configuration. Please check whether
the Pi example works.

> I even changed the corresponding variables in hama-default.xml, but nothing seems to work. The number of BSPTasks remains at 17.

In the 0.6 release, the number of tasks defaults to the number of
input blocks. If you want to force it to 1, please use the TRUNK version.




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Write Hama GraphFile problem

Posted by andronat_asf <an...@hotmail.com>.
Hello again,

I am running Hama v0.6.0 with Hadoop v1.0.4.

I checked the groom logs, but the groom did not write any. There are also no task logs, because my job fails too early. I tried changing the log level (in hama-daemon.sh, changing export HAMA_ROOT_LOGGER="INFO,DRFA" to DEBUG), but nothing useful came up.

While playing around, I changed bsp.tasks.maximum to 20 in conf/hama-default.xml (I know that I'm supposed to override variables in hama-site.xml), and everything started working.

Is this expected? As far as I understand, for some reason I can't override the configuration through my code. Also, is the guide (http://wiki.apache.org/hama/WriteHamaGraphFile) supposed to work with 1 task?

Thanks again,
Anastasis

P.S.
I noticed that when I tried to change the log level of Hama through log4j.properties, nothing happened. Is that expected too, or is something wrong with my installation?
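
Editorial sketch: one plausible reading of the behaviour reported in this message is that the job needs one free task slot per BSP task, and that bsp.tasks.maximum limits the slots a groom server offers. The class below is a simplified, hypothetical model of that check; it is not Hama's SimpleTaskScheduler. The single groom server and the original limit of 3 are assumptions; only the 17 tasks and the working value of 20 come from the thread.

```java
/** Hypothetical model of slot-based scheduling; this is not Hama's SimpleTaskScheduler. */
final class SlotCheck {

    /** True if numTasks fit into the slots offered by all groom servers together. */
    static boolean canScheduleAll(int numTasks, int groomServers, int maxTasksPerGroom) {
        if (numTasks < 0 || groomServers < 0 || maxTasksPerGroom < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        long totalSlots = (long) groomServers * maxTasksPerGroom; // long avoids int overflow
        return numTasks <= totalSlots;
    }

    public static void main(String[] args) {
        int numTasks = 17; // "num BSPTasks: 17" from the bspmaster log
        int grooms = 1;    // assumed: one groom server in pseudo-distributed mode

        System.out.println(canScheduleAll(numTasks, grooms, 3));  // false (3 is an assumed limit)
        System.out.println(canScheduleAll(numTasks, grooms, 20)); // true
    }
}
```

In this model, 17 tasks cannot fit while the limit is below 17, which would be consistent with the "Could not schedule all tasks!" error and with the job running once the limit was raised to 20.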



Re: Write Hama GraphFile problem

Posted by "Edward J. Yoon" <ed...@apache.org>.
What version of Hama are you using? You also need to check the groom server logs and the task logs.




-- 
Best Regards, Edward J. Yoon
@eddieyoon