Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2012/08/21 01:51:37 UTC

[jira] [Created] (GIRAPH-309) Message count is wrong

Avery Ching created GIRAPH-309:
----------------------------------

             Summary: Message count is wrong
                 Key: GIRAPH-309
                 URL: https://issues.apache.org/jira/browse/GIRAPH-309
             Project: Giraph
          Issue Type: Bug
            Reporter: Avery Ching
            Priority: Minor


Currently, the message count is multiplied by the number of partitions, which is incorrect because it is already a total message count for the entire worker.  This affects the Hadoop counter displayed on the job status page.

Old incorrect value

2012-08-16 00:45:12,307 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=10000000,finVtx=0,edges=100000000,msgCount=599165250,haltComputation=false) on superstep = 3

Fixed value

2012-08-20 16:47:11,559 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=10000000,finVtx=0,edges=100000000,msgCount=100000000,haltComputation=false) on superstep = 3
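
To illustrate the kind of over-count described above, here is a minimal sketch. It assumes that each partition's stats carry the worker-wide message total, so summing over partitions counts that total once per partition. The class and method names (MessageCountSketch, PartitionStats, buggyMessageCount, fixedMessageCount) are hypothetical and are not taken from the Giraph source or from GIRAPH-309.patch.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: these types are hypothetical, not Giraph classes. */
final class MessageCountSketch {

    /** Stats reported for one partition; the message count is already worker-wide. */
    static final class PartitionStats {
        final long vertexCount;
        final long workerMessageCount;

        PartitionStats(long vertexCount, long workerMessageCount) {
            this.vertexCount = vertexCount;
            this.workerMessageCount = workerMessageCount;
        }
    }

    /** Over-counts: adds the worker-wide total once per partition. */
    static long buggyMessageCount(List<PartitionStats> partitions) {
        long total = 0;
        for (PartitionStats p : partitions) {
            total += p.workerMessageCount;
        }
        return total;
    }

    /** Counts the worker-wide total once, however many partitions there are. */
    static long fixedMessageCount(List<PartitionStats> partitions) {
        return partitions.isEmpty() ? 0 : partitions.get(0).workerMessageCount;
    }

    public static void main(String[] args) {
        // One worker with three partitions that sent 100 messages in total.
        List<PartitionStats> partitions = Arrays.asList(
            new PartitionStats(10, 100),
            new PartitionStats(20, 100),
            new PartitionStats(30, 100));
        System.out.println("buggy = " + buggyMessageCount(partitions)); // buggy = 300
        System.out.println("fixed = " + fixedMessageCount(partitions)); // fixed = 100
    }
}
```

Vertex counts are genuinely per-partition and can be summed across partitions; the sketch only shows why a worker-wide figure should not be summed the same way.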


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (GIRAPH-309) Message count is wrong

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching reassigned GIRAPH-309:
----------------------------------

    Assignee: Avery Ching
    

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Alessandro Presta (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438953#comment-13438953 ] 

Alessandro Presta commented on GIRAPH-309:
------------------------------------------

That's weird:

{code}
java.lang.IllegalStateException: generateInputSplits: Got IOException
	at org.apache.giraph.graph.BspServiceMaster.generateInputSplits(BspServiceMaster.java:262)
	at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:517)
	at org.apache.giraph.graph.MasterThread.run(MasterThread.java:98)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/tmp/_giraphTests/testContinue
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
	at org.apache.giraph.io.TextVertexInputFormat.getSplits(TextVertexInputFormat.java:120)
	at org.apache.giraph.graph.BspServiceMaster.generateInputSplits(BspServiceMaster.java:242)
	... 2 more
{code}

Let's wait for the next rebuild, as it seems unrelated.
                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445449#comment-13445449 ] 

Avery Ching commented on GIRAPH-309:
------------------------------------

This is strange.  I am able to see the number of messages in between supersteps...
                

[jira] [Updated] (GIRAPH-309) Message count is wrong

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-309:
-------------------------------

    Attachment: GIRAPH-309.patch

The fix is quite simple, no need for rb.
                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Alessandro Presta (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438940#comment-13438940 ] 

Alessandro Presta commented on GIRAPH-309:
------------------------------------------

Can't believe we never noticed this before :D
+1, committing. Thanks!
                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449019#comment-13449019 ] 

Avery Ching commented on GIRAPH-309:
------------------------------------

Actually, everything in the "Giraph Stats" counters is per superstep.  It would be interesting to have a counter about total messages sent though.
                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447415#comment-13447415 ] 

Eli Reisman commented on GIRAPH-309:
------------------------------------

I see what's happening now. The algo I was running sends the same # of messages every superstep, so it looked frozen at superstep 0. Running other algorithms, I now see the Hadoop JobTracker detail page says "
                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438949#comment-13438949 ] 

Hudson commented on GIRAPH-309:
-------------------------------

Integrated in Giraph-trunk-Commit #182 (See [https://builds.apache.org/job/Giraph-trunk-Commit/182/])
    GIRAPH-309: Message count is wrong. (aching via apresta) (Revision 1375717)

     Result = FAILURE
apresta : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1375717
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceMaster.java

                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452144#comment-13452144 ] 

Eli Reisman commented on GIRAPH-309:
------------------------------------

Yes, it would. I was not aware that the total messages sent was supposed to reset each superstep. Either way, I have run it using a number of algorithms where I have a good idea what the counts should look like relative to input data size, and it's working great now.

Perhaps I will throw up a newbie JIRA to rename the counter to reflect that it is a per-superstep count, as everyone using Giraph here had also taken it to be a cumulative count at first glance. Including a total msg counter might not be a bad thing either. Having the per-superstep count has proven really useful, even when I wasn't sure if it was intended or not! Thanks again for the fix.

                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447418#comment-13447418 ] 

Eli Reisman commented on GIRAPH-309:
------------------------------------

Sorry (got cut off!), the JT page lists "Sent Messages", which implies a running total for all iterations (to me at least), but after running other algos I see it's actually reporting a changing amount that is for each individual superstep, as they occur. They are never tallied up in the JT page. Perhaps all we need to do is keep a running total, or alternatively, mark that field as "Sent Messages (Current Superstep)" or something?
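
As a rough sketch of the running-total idea, the per-superstep value and a cumulative value could be tracked side by side. The SentMessageCounters class and its methods are hypothetical, made up for illustration; they are not a Giraph or Hadoop counter API.

```java
/** Illustrative only: a hypothetical counter pair, not a Giraph or Hadoop API. */
final class SentMessageCounters {
    private long currentSuperstep; // messages sent in the most recent superstep
    private long total;            // messages sent across all supersteps so far

    /** Records the aggregated message count of a finished superstep. */
    void recordSuperstep(long messagesSent) {
        if (messagesSent < 0) {
            throw new IllegalArgumentException("messagesSent must be >= 0");
        }
        currentSuperstep = messagesSent; // overwritten every superstep
        total += messagesSent;           // accumulated over the whole job
    }

    /** Value for a field like "Sent Messages (Current Superstep)". */
    long getCurrentSuperstep() {
        return currentSuperstep;
    }

    /** Value for a cumulative "total messages sent" field. */
    long getTotal() {
        return total;
    }

    public static void main(String[] args) {
        SentMessageCounters counters = new SentMessageCounters();
        counters.recordSuperstep(100);
        counters.recordSuperstep(40);
        System.out.println(counters.getCurrentSuperstep()); // 40
        System.out.println(counters.getTotal());            // 140
    }
}
```

Keeping both values avoids the ambiguity in the field name while preserving the per-superstep view.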


                

[jira] [Commented] (GIRAPH-309) Message count is wrong

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445039#comment-13445039 ] 

Eli Reisman commented on GIRAPH-309:
------------------------------------

Hey Avery,

Jakob and I were back up on clusters and running yesterday and saw that we now get exactly the right message count for superstep 0, but after that the counter never updates again, even after a job is finished.

                