Posted to user@flink.apache.org by Attila Bernáth <be...@gmail.com> on 2014/12/03 15:49:25 UTC
partition identifier
Dear Developers,
Datasets are partitioned between machines. I wonder if there is a way
to get some identifier of a partition. I see that the class
HashPartition has a getPartitionNumber method, but I don't see how I
could use this.
(For example, I would like to see the partition identifier in a
MapFunction, or in a MapPartitionFunction).
Attila
Re: partition identifier
Posted by Attila Bernáth <be...@gmail.com>.
I think I have found it: it must be
getRuntimeContext().getIndexOfThisSubtask();
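
For readers following along, here is a minimal sketch of how that call might look inside a rich function. The class name TagWithSubtaskIndex is made up for illustration, and the import path is an assumption: the package that holds RichMapFunction has differed between Flink releases, so check the version you build against. As far as I understand, the index runs from 0 to the operator's parallelism minus one and identifies the parallel subtask instance rather than a physical machine.

```java
// Assumption: package name as in later Flink releases; older releases may differ.
import org.apache.flink.api.common.functions.RichMapFunction;

// Hypothetical example class: prefixes each record with the index of the
// parallel subtask that processed it.
public class TagWithSubtaskIndex extends RichMapFunction<String, String> {

    @Override
    public String map(String value) throws Exception {
        // The runtime context is available because this is a "rich" function.
        int subtask = getRuntimeContext().getIndexOfThisSubtask();
        return subtask + ": " + value;
    }
}
```

It would be applied like any other map function, for example data.map(new TagWithSubtaskIndex()).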
Attila
Re: partition identifier
Posted by Attila Bernáth <be...@gmail.com>.
I am trying to write some code that is cleverer than the optimizer.
The idea is that in Spargel you often want to send the same message to
many other graph nodes. These target nodes are partitioned between the
machines of your cluster, so it would make sense to send the message
to each target machine only once and let that machine distribute it to
the nodes it holds.
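
A rough sketch of the grouping step, in plain Java without any Flink dependency: group the target node IDs by the partition that holds them, so that one message per partition is enough. The class PartitionBatching and the partitionOf helper are hypothetical, and the hash-modulo rule is only an assumed stand-in; the partitioner Flink really uses may assign nodes differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PartitionBatching {

    // Hypothetical stand-in for the real partitioner (assumption: hash modulo parallelism).
    static int partitionOf(long vertexId, int parallelism) {
        return Math.floorMod(Long.hashCode(vertexId), parallelism);
    }

    // Groups target vertex IDs by partition, so one message per partition suffices.
    static Map<Integer, List<Long>> groupByPartition(List<Long> targets, int parallelism) {
        Map<Integer, List<Long>> byPartition = new TreeMap<>();
        for (long target : targets) {
            byPartition
                .computeIfAbsent(partitionOf(target, parallelism), k -> new ArrayList<>())
                .add(target);
        }
        return byPartition;
    }
}
```

With six targets spread over three partitions, this yields three outgoing messages instead of six; the receiving side would then fan the message out to the nodes in its list.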
Attila
Re: partition identifier
Posted by Aljoscha Krettek <al...@apache.org>.
RuntimeContext.getIndexOfThisSubtask()
What do you want to use this partition number for? If I may ask.
Cheers,
Aljoscha
Re: partition identifier
Posted by Attila Bernáth <be...@gmail.com>.
Thank you, Stephan.
How do I access the partition number from the RuntimeContext?
Attila
Re: partition identifier
Posted by Stephan Ewen <se...@apache.org>.
Hey!
Here is a brief description of how to use rich functions:
http://flink.incubator.apache.org/docs/0.7-incubating/programming_guide.html#passing-functions-to-flink
Greetings,
Stephan
Re: partition identifier
Posted by Stephan Ewen <se...@apache.org>.
Hi!
You can always use the "rich" version of the function, for example the
"RichMapFunction". Inside that function, you can call
"getRuntimeContext()", which gives you access to many things, among them
the partition number.
Stephan