Posted to users@pulsar.apache.org by Sachin Mittal <sj...@gmail.com> on 2021/03/23 05:28:54 UTC

Where does pulsar functions jar is copied over on the broker

Hi,
I am creating multiple functions on a Pulsar cluster running on Kubernetes.
Essentially, all the functions use the same jar and function class but are
configured differently with respect to input/output topics.

From the Java Admin client I send the API call to create a function
multiple times, each time with a different name. Based on what I could see
in the implementation, I think the jar contents are serialized as part of
the form data and sent over the network.

I want to know where this jar is stored on the broker side.

If I have many functions, it looks like the same jar will be copied
multiple times to different locations (or is that not the case)?

Is there a way to avoid sending the jar contents each time I create a
function and just refer to the jar containing the function class? I suppose
for this the jar needs to be made available to the broker; if so, how?

Thanks
Sachin

Re: Where does pulsar functions jar is copied over on the broker

Posted by Sijie Guo <gu...@gmail.com>.
On Tue, Mar 23, 2021 at 12:21 AM Sachin Mittal <sj...@gmail.com> wrote:

> Hi,
> So when you say "you can put the function as a built-in function in the
> built-in functions directory"
> you mean that I need to place the jar in this directory, correct?
>

Yes. That's correct.


>
> Also, right now I am using version 2.5.0 and I don't see this config
> in functions_worker.yml.
> Is this feature supported in that version, and if so, in which directory
> can I place my jars?
>

You need to upgrade to 2.7.x in order to use this feature.


>
> Also, how do I put the jar in the built-in directory when the cluster is
> deployed on Kubernetes with, say, three brokers running?
> Do I need to place the jar in each broker's built-in directory? If so,
> how? Is there an API available to do this?
>

You can either a) build a new image based on the Pulsar image and include
the functions there; or b) mount an external persistent volume and put your
jar in that volume.
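
As a rough sketch of option b), the pod spec excerpt below mounts a
persistent volume at the built-in functions directory. Everything named here
is an assumption for illustration: the container name, the volume name, the
claim name, and the mount path /pulsar/functions (which assumes the worker's
built-in functions directory resolves to that path inside the image). Adjust
it to whatever your deployment or Helm chart actually uses.

```yaml
# Pod spec excerpt for a broker running the function worker.
# All names and paths below are hypothetical placeholders.
spec:
  containers:
    - name: pulsar-broker
      image: apachepulsar/pulsar:2.7.0
      volumeMounts:
        - name: builtin-functions
          # Assumed to match the built-in functions directory
          # configured in functions_worker.yml.
          mountPath: /pulsar/functions
          readOnly: true
  volumes:
    - name: builtin-functions
      persistentVolumeClaim:
        # Hypothetical claim that already contains the function jar.
        claimName: pulsar-builtin-functions
```

If several brokers on different nodes share one claim, the underlying volume
has to support a multi-node access mode such as ReadOnlyMany or ReadWriteMany.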

- Sijie

Re: Where does pulsar functions jar is copied over on the broker

Posted by Sachin Mittal <sj...@gmail.com>.
Hi,
So when you say "you can put the function as a built-in function in the
built-in functions directory"
you mean that I need to place the jar in this directory, correct?

Also, right now I am using version 2.5.0 and I don't see this config
in functions_worker.yml.
Is this feature supported in that version, and if so, in which directory
can I place my jars?

Also, how do I put the jar in the built-in directory when the cluster is
deployed on Kubernetes with, say, three brokers running?
Do I need to place the jar in each broker's built-in directory? If so,
how? Is there an API available to do this?

Thanks
Sachin

Re: Where does pulsar functions jar is copied over on the broker

Posted by Sijie Guo <gu...@gmail.com>.
The jar file is stored in BookKeeper. When functions are executed, the
jar is downloaded by each function worker (broker) into a download
directory that is configured at
https://github.com/apache/pulsar/blob/master/conf/functions_worker.yml#L41.

If you want to avoid uploading and downloading the function jar again and
again, you can install the function as a built-in function by placing it in
the built-in functions directory on all the function workers (brokers). That
directory is configured at
https://github.com/apache/pulsar/blob/master/conf/functions_worker.yml#L300
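
For orientation, the two settings referred to above look roughly like the
excerpt below. This is a sketch from memory of the stock functions_worker.yml,
not a copy of it: the key names and values shown are assumptions, and the
linked line numbers drift between releases, so check the file shipped with
your version.

```yaml
# functions_worker.yml (excerpt; key names and values are assumptions,
# verify them against the file in your Pulsar distribution)

# Directory into which each function worker downloads function
# packages fetched from BookKeeper.
downloadDirectory: download/pulsar_functions

# Directory the worker scans for built-in function packages.
functionsDirectory: ./functions
```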

Thanks,
Sijie