Posted to user@flink.apache.org by Husky Zeng <56...@qq.com> on 2020/09/18 07:11:32 UTC

Is there a way to avoid submitting a Hive UDF's resources when we submit a job?

When we submit a job that uses a Hive UDF, the job depends on the UDF's
jars and configuration files.

We already store the UDF's jars and configuration files in the Hive
metastore, so we expect that Flink could obtain the HDFS paths of those
files through the hive-connector and fetch the files from HDFS by those
paths at runtime.

In this code, it seems we already get the paths of those UDF resources in
FunctionInfo, but don't use them:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80

The client currently submits the UDF's jars and configuration files to
YARN together with the job, and we are trying to find a way to avoid
submitting the UDF's resources when we submit a job. Is it possible?
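To make the idea concrete, here is a rough sketch (plain JDK code, not Flink internals; the class name and paths are made up for illustration) of what we imagine the cluster side doing with the resource paths read from the metastore, instead of receiving the jars from the client:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: given resource paths read from the Hive metastore
// (the URIs of the UDF jars), build a child classloader on the cluster side
// rather than shipping the jars from the client. Real hdfs:// URIs would
// need a registered URL stream handler or a local download step first, so
// this sketch uses file:// paths.
public class UdfResourceLoader {

    public static URLClassLoader loaderFor(List<String> resourcePaths) {
        URL[] urls = new URL[resourcePaths.size()];
        for (int i = 0; i < resourcePaths.size(); i++) {
            try {
                urls[i] = new URL(resourcePaths.get(i));
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(
                        "bad resource path: " + resourcePaths.get(i), e);
            }
        }
        // Parent-first delegation to the current classloader; the UDF classes
        // would be resolved from the jar URLs only when first referenced.
        return new URLClassLoader(urls, UdfResourceLoader.class.getClassLoader());
    }

    public static void main(String[] args) {
        URLClassLoader loader = loaderFor(Arrays.asList("file:///tmp/my_udf.jar"));
        System.out.println(loader.getURLs().length); // prints 1
    }
}
```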



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Is there a way to avoid submitting a Hive UDF's resources when we submit a job?

Posted by Rui Li <li...@apache.org>.
Hi Timo,

I believe the blocker for this feature is that we don't support dynamically
adding user jars/resources at the moment. We're able to read the path to
the function jar from Hive metastore, but we cannot load the jar after the
user session is started.

On Tue, Sep 22, 2020 at 3:43 PM Timo Walther <tw...@apache.org> wrote:

> Hi Husky,
>
> I guess https://issues.apache.org/jira/browse/FLINK-14055 is what is
> needed to make this feature possible.
>
> @Rui: Do you know more about this issue and current limitations?
>
> Regards,
> Timo
>
>
> On 18.09.20 09:11, Husky Zeng wrote:
> > When we submit a job that uses a Hive UDF, the job depends on the UDF's
> > jars and configuration files.
> >
> > We already store the UDF's jars and configuration files in the Hive
> > metastore, so we expect that Flink could obtain the HDFS paths of those
> > files through the hive-connector and fetch the files from HDFS by those
> > paths at runtime.
> >
> > In this code, it seems we already get the paths of those UDF resources in
> > FunctionInfo, but don't use them:
> >
> > https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80
> >
> > The client currently submits the UDF's jars and configuration files to
> > YARN together with the job, and we are trying to find a way to avoid
> > submitting the UDF's resources when we submit a job. Is it possible?
> >
> >
> >
>
>

-- 
Cheers,
Rui Li

Re: Is there a way to avoid submitting a Hive UDF's resources when we submit a job?

Posted by Husky Zeng <56...@qq.com>.
Hi Timo,

Thanks for your attention. As I said in the comment below, this feature
would certainly solve our problem, but its workload seems much larger than
what my scenario needs. Our project urgently needs to reuse the Hive UDFs
stored in the Hive metastore, so we lean toward a faster solution. I would
like to hear the community's advice.

https://issues.apache.org/jira/browse/FLINK-19335?focusedCommentId=17199927&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17199927

Best Regards,
Husky Zeng




Re: Is there a way to avoid submitting a Hive UDF's resources when we submit a job?

Posted by Timo Walther <tw...@apache.org>.
Hi Husky,

I guess https://issues.apache.org/jira/browse/FLINK-14055 is what is 
needed to make this feature possible.

@Rui: Do you know more about this issue and current limitations?

Regards,
Timo


On 18.09.20 09:11, Husky Zeng wrote:
> When we submit a job that uses a Hive UDF, the job depends on the UDF's
> jars and configuration files.
>
> We already store the UDF's jars and configuration files in the Hive
> metastore, so we expect that Flink could obtain the HDFS paths of those
> files through the hive-connector and fetch the files from HDFS by those
> paths at runtime.
>
> In this code, it seems we already get the paths of those UDF resources in
> FunctionInfo, but don't use them:
>
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80
>
> The client currently submits the UDF's jars and configuration files to
> YARN together with the job, and we are trying to find a way to avoid
> submitting the UDF's resources when we submit a job. Is it possible?
> 
> 
> 