Posted to user@flink.apache.org by ri...@sina.cn on 2016/09/30 08:13:18 UTC

Re: How to Broadcast a very large model object (used in iterative scoring in recommendation system) in Flink

Your message is very short, so I cannot tell much about your setup. The following is my guess.
    In Flink, DataStream is not designed for iterative computation; DataSet would suit that better. Flink also recommends broadcasting only small data, not large objects.

   You can load your model data (from a file or a table) before the main function and assign it to a variable, e.g. yourModel.
 Then take the DataStream of unscored records (e.g. DataStream[String] or DataStream[YourClass])
and score each record in a map:
dataStream.map { x =>
  computeScore(x, yourModel)  // the score is the result of the map
}

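object YourObject {

  // Load your model here (from a file, a table, ...).
  val yourModel = ...

  def main(args: Array[String]): Unit = {
    // ...
    // Read the unscored records from a socket, Kafka, or another source.

    dataStream.map { x =>
      computeScore(x, yourModel)  // the score is the result of the map
    }
    // ...
  }
}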
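
To make the pattern above concrete, here is a minimal plain-Scala sketch. It does not use the Flink API at all: a Seq stands in for the DataStream, and the Model type, loadModel, and the body of computeScore are all hypothetical placeholders I made up for illustration. The point is only that the model lives in a field of a Scala object and is loaded once per JVM on first access, so the map function can reference it without it being passed in as an argument. I have not verified how this behaves for a very large model on a real cluster.

```scala
// Plain-Scala sketch: a Seq stands in for the Flink DataStream.
object ScoringSketch {
  // Hypothetical model type: item id -> weight.
  type Model = Map[String, Double]

  // Hypothetical loader; a real one would read from a file or a table.
  def loadModel(): Model = Map("a" -> 0.5, "b" -> 2.0)

  // Initialized lazily, once per JVM, the first time it is accessed.
  lazy val yourModel: Model = loadModel()

  // Hypothetical scoring function: sum the weights of the known items
  // in a comma-separated record; unknown items contribute 0.0.
  def computeScore(record: String, model: Model): Double =
    record.split(",").map(item => model.getOrElse(item.trim, 0.0)).sum

  def main(args: Array[String]): Unit = {
    // Stand-in for the unscored records read from a socket or Kafka.
    val records = Seq("a,b", "b", "c")

    // Same shape as dataStream.map { x => computeScore(x, yourModel) }.
    val scored = records.map(x => (x, computeScore(x, yourModel)))
    scored.foreach(println)
  }
}
```

In a real job the body of the map would stay the same, with dataStream in place of records.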
----- Original Message -----
From: Anchit Jatana <de...@gmail.com>
To: user@flink.apache.org
Subject: How to Broadcast a very large model object (used in iterative scoring in recommendation system) in Flink
Date: 2016-09-30 14:15

Hi All,
I'm building a recommendation system streaming application for which I need to broadcast a very large model object (used in iterative scoring) among all the task managers performing the operation in parallel for the operator.
I'm doing this operation in map1 of a CoMapFunction. Please suggest some way to achieve the broadcasting of the large model variable (something similar to what Spark has with broadcast variables).
Thank you
Regards, Anchit