You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Fridtjof Sander <fs...@mailbox.tu-berlin.de> on 2016/01/28 23:05:45 UTC

How to register aggregation convergence criterion to bulk iteration in scala API?

Hi,

I want to register a custom aggregation convergence criterion to a bulk 
iteration and I want to use the scala API.
It appears to me that this is not possible at the moment, right?

The AggregatorRegistry is exposed by IterativeDataSet.java, which is 
hidden by DataSet.scala:

   def iterate(maxIterations: Int)(stepFunction: (DataSet[T]) => 
DataSet[T]): DataSet[T] = {
     val iterativeSet =
       new IterativeDataSet[T](
         javaSet.getExecutionEnvironment,
         javaSet.getType,
         javaSet,
         maxIterations)

     val resultSet = stepFunction(wrap(iterativeSet))
     val result = iterativeSet.closeWith(resultSet.javaSet)
     wrap(result)
   }

I am aware of the iterateWithTermination-possibility and it's a 
work-around for me, but I guess the aggregated convergence criterion 
would be more efficient.
Am I missing something?

Best,
Fridtjof

Re: How to register aggregation convergence criterion to bulk iteration in scala API?

Posted by Stephan Ewen <se...@apache.org>.
You are right, that is currently missing in the Scala API. Would be good to
add this, for feature completeness in the Scala API.

As a workaround for now: Can you access the Java IterativeDataSet from the
Scala data set, and register it there?

Greetings,
Stephan


On Thu, Jan 28, 2016 at 11:05 PM, Fridtjof Sander <
fsander@mailbox.tu-berlin.de> wrote:

> Hi,
>
> I want to register a custom aggregation convergence criterion to a bulk
> iteration and I want to use the scala API.
> It appears to me that this is not possible at the moment, right?
>
> The AggregatorRegistry is exposed by IterativeDataSet.java, which is
> hidden by DataSet.scala:
>
>   def iterate(maxIterations: Int)(stepFunction: (DataSet[T]) =>
> DataSet[T]): DataSet[T] = {
>     val iterativeSet =
>       new IterativeDataSet[T](
>         javaSet.getExecutionEnvironment,
>         javaSet.getType,
>         javaSet,
>         maxIterations)
>
>     val resultSet = stepFunction(wrap(iterativeSet))
>     val result = iterativeSet.closeWith(resultSet.javaSet)
>     wrap(result)
>   }
>
> I am aware of the iterateWithTermination-possibility and it's a
> work-around for me, but I guess the aggregated convergence criterion would
> be more efficient.
> Am I missing something?
>
> Best,
> Fridtjof
>