Posted to dev@giraph.apache.org by "Hyunsik Choi (Created) (JIRA)" <ji...@apache.org> on 2011/10/28 04:15:32 UTC

[jira] [Created] (GIRAPH-68) Implement a Graph Generator

Implement a Graph Generator
---------------------------

                 Key: GIRAPH-68
                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
             Project: Giraph
          Issue Type: New Feature
          Components: benchmark
    Affects Versions: 0.70.0
            Reporter: Hyunsik Choi


To provide users with benchmark environments and to thoroughly test Giraph's input/output system, we need a graph generator. The generator will produce various kinds of graph data sets according to a specified VertexInputFormat and VertexOutputFormat.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi updated GIRAPH-68:
-------------------------------

    Attachment: GIRAPH-68_2.patch

Avery,

Thank you for the review.

I think the GraphGenerator is necessary for testing the IO-related subsystems as a whole. For example, the *InputFormat classes and the Partitioners can be exercised with a generated data set instead of PseudoRandomVertexInputFormat.

As you suggested, I modified PageRankBenchmark and RandomMessageBenchmark to accept an InputFormat and an input path. If the input format and input path are not given, the benchmarks behave as they do now and use PseudoRandomVertexInputFormat.
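
The fallback described above could be sketched roughly as follows. This is only an illustration and is not taken from the patch: the class name BenchmarkInput, the resolve method, and the decision to reject a format given without a path (or the reverse) are all assumptions.

```java
import java.util.Optional;

/**
 * Hypothetical sketch of the fallback rule; these names are illustrative
 * and are not taken from the GIRAPH-68 patch.
 */
final class BenchmarkInput {
    /** Name of the format the benchmarks use when nothing is specified. */
    static final String DEFAULT_FORMAT = "PseudoRandomVertexInputFormat";

    final String format;
    final Optional<String> path;

    private BenchmarkInput(String format, Optional<String> path) {
        this.format = format;
        this.path = path;
    }

    /**
     * Uses the given format and path when both are present, and falls back
     * to the default when both are absent. Rejecting a half-specified pair
     * is an assumption of this sketch.
     */
    static BenchmarkInput resolve(String format, String path) {
        boolean hasFormat = format != null && !format.isEmpty();
        boolean hasPath = path != null && !path.isEmpty();
        if (hasFormat && hasPath) {
            return new BenchmarkInput(format, Optional.of(path));
        }
        if (!hasFormat && !hasPath) {
            return new BenchmarkInput(DEFAULT_FORMAT, Optional.empty());
        }
        throw new IllegalArgumentException(
            "input format and input path must be given together");
    }
}
```

Calling BenchmarkInput.resolve(null, null) in this sketch selects the pseudo-random format with no path, which keeps existing benchmark invocations unchanged.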
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.70.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch, GIRAPH-68_2.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Commented] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399021#comment-13399021 ] 

Hyunsik Choi commented on GIRAPH-68:
------------------------------------

I think that it is still relevant. I also intended for this patch to use an identity vertex. The generator is a kind of helper that lets users do that easily. In addition, this patch enables the existing benchmarks to use an input data set on HDFS instead of PseudoRandomVertexInputFormat. If you agree, I'll rebase the patch this weekend.
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.1.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch, GIRAPH-68_2.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Updated] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi updated GIRAPH-68:
-------------------------------

    Attachment: GIRAPH-68_1.patch

I attached the patch. The GraphGenerator class writes generated graph data into a specified HDFS directory using the existing Input/OutputFormats.
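
As a rough illustration of the idea, and not the GraphGenerator from the patch, a generator can be separated from the way vertices are serialized. Everything here is hypothetical: the class GraphGeneratorSketch, the VertexLineFormat interface that stands in for an output format, and the use of an in-memory Writer in place of an HDFS directory.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; not the GraphGenerator class from the patch. */
final class GraphGeneratorSketch {
    /** Stand-in for an output format: turns one vertex into one text line. */
    interface VertexLineFormat {
        String format(long vertexId, List<Long> targets);
    }

    /**
     * Writes numVertices vertices, each with edgesPerVertex random targets.
     * A fixed seed makes the generated graph reproducible.
     */
    static void generate(long numVertices, int edgesPerVertex, long seed,
                         VertexLineFormat format, Writer out)
            throws IOException {
        Random random = new Random(seed);
        for (long id = 0; id < numVertices; id++) {
            List<Long> targets = new ArrayList<>();
            for (int e = 0; e < edgesPerVertex; e++) {
                // Targets fall in [0, numVertices); duplicates and
                // self-loops are allowed in this sketch.
                targets.add(Math.floorMod(random.nextLong(), numVertices));
            }
            out.write(format.format(id, targets));
            out.write('\n');
        }
    }
}
```

Swapping the VertexLineFormat implementation changes the on-disk layout without touching the generation loop, which is the property that makes generated data useful for testing different input formats.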
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.70.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Commented] (GIRAPH-68) Implement a Graph Generator

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398742#comment-13398742 ] 

Jakob Homan commented on GIRAPH-68:
-----------------------------------

Is this still relevant in its current form? Looks like we could accomplish the same thing if we just had an identity vertex that ran for one superstep and wrote out its state, in conjunction with an inputformat that generates a graph?
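
A toy model of that alternative, written in plain Java without the Giraph API, might look like the following. The names IdentitySuperstepSketch, VertexState, runSupersteps, and writeOut are made up for illustration, and the tab-separated output line is an assumed format.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the idea; it does not use the Giraph API. */
final class IdentitySuperstepSketch {
    /** Minimal vertex state: an id, a value, and outgoing edge targets. */
    static final class VertexState {
        final long id;
        final double value;
        final List<Long> targets;
        boolean halted;

        VertexState(long id, double value, List<Long> targets) {
            this.id = id;
            this.value = value;
            this.targets = targets;
        }
    }

    /** Identity computation: leave the state untouched and vote to halt. */
    static void compute(VertexState vertex) {
        vertex.halted = true;
    }

    /** Runs supersteps until every vertex has halted; returns the count. */
    static int runSupersteps(List<VertexState> vertices) {
        int supersteps = 0;
        boolean anyActive = !vertices.isEmpty();
        while (anyActive) {
            for (VertexState vertex : vertices) {
                if (!vertex.halted) {
                    compute(vertex);
                }
            }
            supersteps++;
            anyActive = false;
            for (VertexState vertex : vertices) {
                if (!vertex.halted) {
                    anyActive = true;
                }
            }
        }
        return supersteps;
    }

    /** Writes each vertex as "id<TAB>value<TAB>targets", one per line. */
    static List<String> writeOut(List<VertexState> vertices) {
        List<String> lines = new ArrayList<>();
        for (VertexState vertex : vertices) {
            lines.add(vertex.id + "\t" + vertex.value + "\t" + vertex.targets);
        }
        return lines;
    }
}
```

Because every vertex halts in its first compute call, the job ends after a single superstep and the written output is exactly what the generating input format produced.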
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.1.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch, GIRAPH-68_2.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Commented] (GIRAPH-68) Implement a Graph Generator

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151405#comment-13151405 ] 

Avery Ching commented on GIRAPH-68:
-----------------------------------


Looks good Hyunsik, a few comments.

Probably want to add a javadoc comment for GraphGenerator
Lines 40-41: Should have 8 space indenting
Line 46: needs 4 more spaces
Line 58: Over 80 chars

So is the idea that PageRankBenchmark and RandomMessageBenchmark would use it?  Would you like to modify them to do so?
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.70.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Assigned] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi reassigned GIRAPH-68:
----------------------------------

    Assignee: Hyunsik Choi
    
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.70.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Commented] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404494#comment-13404494 ] 

Hyunsik Choi commented on GIRAPH-68:
------------------------------------

Sorry for the delay. I had second thoughts. As you suggested, we can achieve the same thing via GiraphRunner if we have an identity vertex. That approach looks simpler and better. However, we still need to enable PageRankBenchmark and RandomMessageBenchmark to take an input format and input paths.

As you suggested, I'll open a new JIRA issue for IdentityVertex, and I would like to change the scope of GIRAPH-68 to modifying PageRankBenchmark and RandomMessageBenchmark to take an input format and input paths.

Any other suggestions?
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.1.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch, GIRAPH-68_2.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.


        

[jira] [Commented] (GIRAPH-68) Implement a Graph Generator

Posted by "Hyunsik Choi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152185#comment-13152185 ] 

Hyunsik Choi commented on GIRAPH-68:
------------------------------------

I missed the javadoc. I will reattach the patch with the javadoc included.
                
> Implement a Graph Generator
> ---------------------------
>
>                 Key: GIRAPH-68
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-68
>             Project: Giraph
>          Issue Type: New Feature
>          Components: benchmark
>    Affects Versions: 0.70.0
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>         Attachments: GIRAPH-68_1.patch, GIRAPH-68_2.patch
>
>
> To provide users with benchmark environments and to deeply test the input/output system of giraph, we need a graph generator. We will enable the graph generator to generate various kinds of graph data sets by specifying a VertexInputFormat and a VertexOutputFormat.
