Posted to issues@spark.apache.org by "Aaron Zolnai-Lucas (Jira)" <ji...@apache.org> on 2022/06/01 15:23:00 UTC

[jira] [Updated] (SPARK-39356) Add option to skip initial message in Pregel API

     [ https://issues.apache.org/jira/browse/SPARK-39356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Zolnai-Lucas updated SPARK-39356:
---------------------------------------
    Priority: Minor  (was: Major)

> Add option to skip initial message in Pregel API
> ------------------------------------------------
>
>                 Key: SPARK-39356
>                 URL: https://issues.apache.org/jira/browse/SPARK-39356
>             Project: Spark
>          Issue Type: Improvement
>          Components: GraphX
>    Affects Versions: 3.2.1
>            Reporter: Aaron Zolnai-Lucas
>            Priority: Minor
>              Labels: graphx, pregel
>
> The current (3.2.1) [Pregel API|https://github.com/apache/spark/blob/5a3ba9b0b301a3b0c43f8d0d88e2b6bdce57d0e6/graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala#L117] takes a parameter {{initialMsg: A}}, where {{A : scala.reflect.ClassTag}} is the message type for the Pregel iterations. At the start of the iterative process, the user-supplied vertex update method {{vprog}} is called on every vertex with this initial message.
> However, in some cases a message-passing scheme is most naturally described as starting with the {{message}} phase ({{sendMsg}}) rather than the {{vprog}} phase, and in many cases the first message depends on individual vertex data rather than on a static initial message. In these cases, users are forced to add boilerplate to their {{vprog}} function that checks whether the received message is {{initialMsg}} and, if so, ignores it (leaves the vertex state unchanged). This makes the code less efficient (due to the extra iteration and check) and less readable.
>  
> My proposed solution is to change {{initialMsg}} to a parameter of type {{Option[A]}} with default {{None}}, and then, inside the {{Pregel.apply}} function, set:
> {code:scala}
> var g = initialMsg match {
>   case Some(msg) => graph.mapVertices((vid, vdata) => vprog(vid, vdata, msg))
>   case _ => graph
> }
> {code}
> This way, the user chooses whether to start the iteration from the {{message}} or {{vprog}} phase. I believe this small change could improve user code readability and efficiency.
> Note: The signature of {{GraphOps.pregel}} would have to be changed to match.
>  
>  
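
To make the difference between the two styles concrete, here is a minimal sketch in plain Scala. It is a simplified in-memory model of the Pregel loop, not GraphX code: the names PregelSketch, Edge, run and Sentinel are hypothetical, sendMsg takes a simplified (edge, source value, destination value) signature, and, unlike GraphX, messages are sent along every edge in each round rather than only from active vertices. Only the handling of initialMsg as an Option follows the snippet in the description above.

```scala
object PregelSketch {
  type VertexId = Long
  // Hypothetical, simplified edge type (GraphX edges also carry an attribute).
  final case class Edge(src: VertexId, dst: VertexId)

  // Simplified model of the proposed entry point: initialMsg is optional.
  def run[VD, A](
      vertices: Map[VertexId, VD],
      edges: Seq[Edge],
      initialMsg: Option[A] = None,
      maxIterations: Int = Int.MaxValue
  )(
      vprog: (VertexId, VD, A) => VD,
      sendMsg: (Edge, VD, VD) => Iterator[(VertexId, A)],
      mergeMsg: (A, A) => A
  ): Map[VertexId, VD] = {
    // The proposed change: apply vprog up front only if an initial message exists.
    var state = initialMsg match {
      case Some(msg) => vertices.map { case (id, vd) => id -> vprog(id, vd, msg) }
      case None      => vertices
    }

    // Send along every edge, then merge the messages addressed to each vertex.
    def collect(s: Map[VertexId, VD]): Map[VertexId, A] =
      edges
        .flatMap(e => sendMsg(e, s(e.src), s(e.dst)))
        .groupBy(_._1)
        .map { case (id, msgs) => id -> msgs.map(_._2).reduce(mergeMsg) }

    var messages = collect(state)
    var i = 0
    while (messages.nonEmpty && i < maxIterations) {
      // Vertices without a message keep their current state.
      state = state.map { case (id, vd) =>
        id -> messages.get(id).map(m => vprog(id, vd, m)).getOrElse(vd)
      }
      messages = collect(state)
      i += 1
    }
    state
  }

  // Example: propagate the maximum value along edge direction.
  val vertices: Map[VertexId, Long] = Map(1L -> 3L, 2L -> 1L, 3L -> 5L)
  val edges: Seq[Edge] = Seq(Edge(1L, 2L), Edge(2L, 3L))

  val send: (Edge, Long, Long) => Iterator[(VertexId, Long)] =
    (e, src, dst) => if (src > dst) Iterator(e.dst -> src) else Iterator.empty
  val merge: (Long, Long) => Long = (a, b) => math.max(a, b)

  // Current style: a sentinel initial message that vprog has to recognise and ignore.
  val Sentinel: Long = Long.MinValue // hypothetical sentinel value
  val guarded: Map[VertexId, Long] =
    run[Long, Long](vertices, edges, Some(Sentinel))(
      (_, value, msg) => if (msg == Sentinel) value else math.max(value, msg),
      send,
      merge
    )

  // Proposed style: no initial message, so vprog needs no sentinel check.
  val direct: Map[VertexId, Long] =
    run[Long, Long](vertices, edges)(
      (_, value, msg) => math.max(value, msg),
      send,
      merge
    )
}
```

In this model both calls produce the same final vertex values; the second one skips the initial pass over the vertices and needs no sentinel check in vprog. The type arguments are written out explicitly to keep the sketch unambiguous when initialMsg is omitted.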



--
This message was sent by Atlassian Jira
(v8.20.7#820007)