Posted to user@flink.apache.org by Flavio Pompermaier <po...@okkam.it> on 2015/01/27 14:37:32 UTC
Operators chaining as custom functions
Hi guys,
I'd like to know whether it is possible to save a chain of operators as
a custom function.
For example, if I have a DataSet transformation composed of a map followed
by a reduce (this is a simple case; I could have a more complex
scenario), is it possible to save it as a custom function (e.g. "myFunction"
=> map.reduce) so that I can call myDataset.myFunction() instead of
myDataset.map().reduce()?
Best,
Flavio
Re: Operators chaining as custom functions
Posted by Alexander Alexandrov <al...@gmail.com>.
I don't see any reason why the Scala approach should not work in Java. For
example, the flink-graph API seems to be built on top of this concept (in
Java):
https://github.com/project-flink/flink-graph/blob/master/src/main/java/flink/graphs/Graph.java
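One way to read this suggestion in Java is a wrapper class that holds the data and exposes the chained operators as named methods. The following is only a minimal sketch under that assumption: a plain List<String> stands in for a DataSet, and the names Words, totalLength, upperCased and WordsDemo are hypothetical and not taken from flink-graph.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical wrapper class. List<String> is only a stand-in for a DataSet.
final class Words {
    private final List<String> items;

    Words(List<String> items) {
        this.items = items;
    }

    // One named method hides the map-then-reduce chain.
    int totalLength() {
        return items.stream().map(String::length).reduce(0, Integer::sum);
    }

    // Returning the wrapper type keeps calls chainable.
    Words upperCased() {
        return new Words(items.stream().map(String::toUpperCase).collect(Collectors.toList()));
    }
}

class WordsDemo {
    public static void main(String[] args) {
        Words words = new Words(List.of("flink", "graph"));
        // Callers use the named operations instead of repeating the chain.
        System.out.println(words.upperCased().totalLength());
    }
}
```

The trade-off is that callers must construct the wrapper explicitly, since Java has no implicit conversion that would add methods to an existing class.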
2015-01-27 23:45 GMT+01:00 Flavio Pompermaier <po...@okkam.it>:
> Hi Stephan, thanks for the response! Is that the only possibility? There's
> no Java alternative at the moment?
Re: Operators chaining as custom functions
Posted by Flavio Pompermaier <po...@okkam.it>.
Hi Stephan, thanks for the response! Is that the only possibility? There's
no Java alternative at the moment?
On Jan 27, 2015 10:23 PM, "Stephan Ewen" <se...@apache.org> wrote:
Re: Operators chaining as custom functions
Posted by Stephan Ewen <se...@apache.org>.
Hi Flavio!
In Scala:
You can do that using the "pimp my library" pattern. Define your own data
set (MyDataSet) that has the method "myFunction()" and define an implicit
conversion from DataSet to MyDataSet. See here for more details:
http://alvinalexander.com/scala/scala-2.10-implicit-class-example
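In Java:
1) You can define a utility class MyUtils with a static method:
import org.apache.flink.api.java.DataSet;

public class MyUtils {
    // Wraps the map/reduce chain in a single reusable static method.
    public static DataSet<MyType> myFunction(DataSet<String> input) {
        return input.map(...).reduce(...);
    }
}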
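2) You can define a custom unary operation. It allows you to write code like
"DataSet<MyType> result = data.runOperation(new MyFunction());":
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.operators.CustomUnaryOperation;
public class MyFunction implements CustomUnaryOperation<String, MyType> {
    private DataSet<String> input;
    // Called by runOperation() with the data set the operation is applied to.
    @Override
    public void setInput(DataSet<String> input) {
        this.input = input;
    }
    // Builds and returns the chained transformation.
    @Override
    public DataSet<MyType> createResult() {
        return input.map(...).reduce(...);
    }
}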
https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/operators/CustomUnaryOperation.java
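To make the mechanics of option 2 concrete, here is a minimal, self-contained sketch that does not use Flink. MiniDataSet and MiniUnaryOperation are hypothetical in-memory stand-ins for DataSet and CustomUnaryOperation, and TotalLength is an illustrative operation. The sketch assumes that runOperation simply hands the data set to setInput and returns the result of createResult.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

// Stand-in for a data set: an in-memory list with map/reduce (NOT the Flink class).
class MiniDataSet<T> {
    final List<T> items;

    MiniDataSet(List<T> items) {
        this.items = items;
    }

    <R> MiniDataSet<R> map(Function<T, R> f) {
        return new MiniDataSet<>(items.stream().map(f).collect(Collectors.toList()));
    }

    // Combines all elements into one; an empty input yields an empty result.
    MiniDataSet<T> reduce(BinaryOperator<T> f) {
        List<T> out = new ArrayList<>();
        items.stream().reduce(f).ifPresent(out::add);
        return new MiniDataSet<>(out);
    }

    // Hands this data set to the operation and returns whatever it builds.
    <R> MiniDataSet<R> runOperation(MiniUnaryOperation<T, R> op) {
        op.setInput(this);
        return op.createResult();
    }
}

// Stand-in with the same two methods as the MyFunction class above.
interface MiniUnaryOperation<IN, OUT> {
    void setInput(MiniDataSet<IN> input);
    MiniDataSet<OUT> createResult();
}

// The reusable chain: map each string to its length, then sum the lengths.
class TotalLength implements MiniUnaryOperation<String, Integer> {
    private MiniDataSet<String> input;

    @Override
    public void setInput(MiniDataSet<String> input) {
        this.input = input;
    }

    @Override
    public MiniDataSet<Integer> createResult() {
        return input.map(String::length).reduce(Integer::sum);
    }
}

class TotalLengthDemo {
    public static void main(String[] args) {
        MiniDataSet<String> data = new MiniDataSet<>(List.of("a", "bb", "ccc"));
        // The whole chain is applied through a single call.
        System.out.println(data.runOperation(new TotalLength()).items);
    }
}
```

Compared with the static helper in option 1, the operation object keeps the call in method-chaining position, so it can sit in the middle of a longer chain of transformations.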
Greetings,
Stephan