Posted to user@hama.apache.org by Kostas Xirog <k0...@gmail.com> on 2013/07/10 09:31:42 UTC

Load Graph into memory ?

Hello,

  I've written an application that creates a graph and runs a query on
this graph. I want to test this application with very large amounts of data.
The graph is created from data that's structured into records.
With 5 million records, for instance, it takes about 5 minutes for
Hama to create the graph for this data. I want to load 100 million records
into the system. Is there a way I can do that only once and then run
programs on the existing graph?

Thank you in advance,
Kostas X.
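
[Editor's note] As a rough back-of-envelope check on the numbers in the
question, the sketch below extrapolates the reported load time to 100
million records. It assumes load time grows linearly with the record
count, which the thread does not confirm; partitioning or memory pressure
could make the real figure noticeably worse.

```python
# Figures reported in the question.
measured_records = 5_000_000
measured_minutes = 5.0
target_records = 100_000_000

# Assumption (not confirmed in the thread): load time scales linearly
# with the number of records.
records_per_minute = measured_records / measured_minutes
estimated_minutes = target_records / records_per_minute

print(f"throughput: {records_per_minute:,.0f} records/min")
print(f"estimated load time: {estimated_minutes:.0f} min")
```

Under that assumption each run would spend well over an hour rebuilding
the graph, which is why loading it only once is attractive.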

Re: Load Graph into memory ?

Posted by "Edward J. Yoon" <ed...@apache.org>.
Please try DiskVerticesInfo and DiskQueue. Checking them is on my TODO
list, but I have not had time to do it yet.

And if you are interested in solving problems and contributing
your code to Apache Hama, please feel free to submit patches.
:-)




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Load Graph into memory ?

Posted by Kostas Xirog <k0...@gmail.com>.
Thanks for the quick reply, I understand.
As far as the memory problem goes, is there any way for me to fix it? I'm
running the program on an 8-core machine with 8 GB of RAM ...

Thanks in advance,
Kostas X.
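
[Editor's note] To put the memory concern in perspective, the sketch
below computes how many bytes of RAM are available per record if all 100
million records had to fit in 8 GB. The usable_fraction parameter is a
hypothetical knob for the share of RAM the graph can actually use, and
the mapping from records to vertices and edges is not stated in the
thread, so treat the result as an upper bound only.

```python
ram_bytes = 8 * 1024**3   # 8 GB of RAM, read here as 8 GiB
records = 100_000_000     # target data size from the question


def bytes_per_record(usable_fraction):
    """Bytes available per record if only part of the RAM is usable.

    usable_fraction is a hypothetical parameter; the thread gives no
    figure for JVM or operating-system overhead.
    """
    return ram_bytes * usable_fraction / records


print(f"all RAM usable:  {bytes_per_record(1.0):.1f} bytes/record")
print(f"half RAM usable: {bytes_per_record(0.5):.1f} bytes/record")
```

Even the optimistic case leaves under 90 bytes per record for the vertex,
its edges, and any in-flight messages, which is a tight budget for an
in-memory graph on one machine.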



Re: Load Graph into memory ?

Posted by "Edward J. Yoon" <ed...@apache.org>.
>> hama to create the graph for this data. I want to load 100 million records
>> into the system. Is there a way I can do that only once and then run

P.S. You will run into memory problems if you don't have a large cluster.




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Load Graph into memory ?

Posted by "Edward J. Yoon" <ed...@apache.org>.
If you mean reusing already partitioned data: yes, but that is not
supported yet.

If you mean reusing vertices in memory (skipping the loading phase):
no. Since the system terminates the program automatically when no more
updates occur, you cannot initialize the vertex values and run the
same algorithm again. However, I think we can consider a new API for
initializing and reusing vertices.




-- 
Best Regards, Edward J. Yoon
@eddieyoon
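
[Editor's note] The termination rule described in the reply above (the
run ends once no more updates occur) can be illustrated with a generic
vertex-centric loop. This is a minimal sketch in plain Python, not Hama's
API: run_max_propagation and its arguments are hypothetical names, and
the algorithm (propagating the maximum value) is chosen only because it
is short.

```python
def run_max_propagation(edges, values):
    """Spread the largest vertex value through the graph.

    edges maps each vertex to its neighbours; values maps each vertex
    to its starting value. The loop stops as soon as no vertex has
    anything left to send, mirroring the "terminate when no more
    updates occur" rule.
    """
    values = dict(values)

    # First superstep: every vertex sends its value to its neighbours.
    messages = {v: [] for v in values}
    for v, val in values.items():
        for n in edges.get(v, []):
            messages[n].append(val)
    supersteps = 1

    # Later supersteps: only vertices whose value changed send again.
    while any(messages.values()):
        next_messages = {v: [] for v in values}
        for v, inbox in messages.items():
            if inbox and max(inbox) > values[v]:
                values[v] = max(inbox)
                for n in edges.get(v, []):
                    next_messages[n].append(values[v])
        messages = next_messages
        supersteps += 1

    return values, supersteps


edges = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": 3, "b": 6, "c": 2}
final_values, steps = run_max_propagation(edges, start)
print(final_values, steps)
```

In this sketch the loop simply returns when the messages dry up. In the
job model described in the reply, that same point is where the program
is terminated, so a second query would have to load the graph again
unless an API for re-initializing and reusing vertices were added.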