Posted to user@flink.apache.org by Antonio Martínez Carratalá <am...@alto-analytics.com> on 2020/03/02 07:32:57 UTC

Re: Flink remote batch execution in dynamic cluster

Thank you, Piotrek. I will look into those options. I only have a standalone
cluster, so any of them would require some setup.
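
For reference, a minimal sketch of the kind of remote submission discussed
in this thread, using ExecutionEnvironment.createRemoteEnvironment(). The
host name, port and jar path below are placeholders (assumptions, not values
from the thread), and the tiny DataSet stands in for the real report job. It
needs the Flink Java API on the classpath and a reachable cluster.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class RemoteReportJob {

    public static void main(String[] args) throws Exception {
        // Placeholder values: replace with the address of your own Job Manager
        // and the jar that contains the job's classes.
        String host = "flink-jobmanager.example.com"; // hypothetical host
        int port = 8081;                              // assumed port, adjust to your setup
        String jobJar = "/path/to/report-job.jar";    // hypothetical path

        // The program is built locally and shipped, together with the listed
        // jar files, to the remote cluster when the job is executed.
        ExecutionEnvironment env =
                ExecutionEnvironment.createRemoteEnvironment(host, port, jobJar);

        // Stand-in for the real report computation.
        DataSet<String> requests = env.fromElements("report-a", "report-b", "report-a");

        // count() triggers execution on the remote cluster and returns the result.
        long distinctReports = requests.distinct().count();
        System.out.println("Distinct reports: " + distinctReports);
    }
}
```

The client code stays the same whichever deployment runs behind the Job
Manager address; only the host and port would change.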

On Fri, Feb 28, 2020 at 2:12 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> I guess it depends on what you already have available in your cluster; try
> to use that. Running Flink in an existing Yarn cluster is very easy, but
> setting up a Yarn cluster in the first place, even if it’s easy (I’m not
> sure whether that’s the case), would add extra complexity.
>
> When I’m spawning an AWS cluster for testing, I use EMR with Yarn
> included, and I think that’s very easy to do, as everything works out of the
> box. I’ve heard that Kubernetes/Docker are just as easy. I’m also not a
> DevOps person, but I’ve heard that my colleagues, if they have any
> preference, usually prefer Kubernetes.
>
> > Keep in mind that I need to run the job with
> > ExecutionEnvironment.createRemoteEnvironment(); uploading a jar is not a
> > valid option for me. It seems to me that not all the options support
> > remote submission of jobs, but I'm not sure.
>
>
> I think all of them should support the remote environment. Standalone,
> Yarn, Kubernetes and Docker almost certainly do.
>
> Piotrek
>
> On 28 Feb 2020, at 10:25, Antonio Martínez Carratalá <
> amartinez@alto-analytics.com> wrote:
>
> Hello
>
> I'm working on a project with Flink 1.8. I'm running my code from Java
> against a remote Flink cluster as described here:
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/cluster_execution.html
> That part is working, but I want to configure a dynamic Flink cluster to
> execute the jobs.
>
> Imagine I have users who sometimes need to run a report. The report is
> generated from data processed in Flink, so whenever a user requests one I
> have to submit a job to a remote Flink cluster. The job is heavy and may
> take 1 hour to finish.
>
> So I don't want to have 3, 4, 5... Task Managers always running in the
> cluster: sometimes they are idle, and at other times there are not enough
> Task Managers for all the requests. I want to create Task Managers
> dynamically as jobs are received at the Job Manager, and get rid of them
> when the jobs finish.
>
> I see a lot of options for creating a cluster in
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/ under
> [Deployment & Operations] [Clusters & Deployment], such as Standalone, YARN,
> Mesos, Docker, Kubernetes... but I don't know which would be the most
> suitable for my use case. I'm not an expert in DevOps and I barely know
> these technologies.
>
> Some advice on which technology to use, and maybe some examples, would be
> really appreciated.
>
> Keep in mind that I need to run the job with
> ExecutionEnvironment.createRemoteEnvironment(); uploading a jar is not a
> valid option for me. It seems to me that not all the options support remote
> submission of jobs, but I'm not sure.
>
> Thank you
>
> Antonio Martinez
>
>
>
