Posted to dev@falcon.apache.org by Jean-Baptiste Onofré <jb...@nanthrax.net> on 2015/02/03 15:51:38 UTC
Falcon with Spark ?
Hi all,
I'm working (and finally resuming my work ;)) on some Falcon features:
- Updates and improvements to the ActiveMQ broker
- Complete CDC support of diff/gap storage
- Support for more workflow entities (MapReduce directly instead of an Oozie
workflow.xml, etc)
For the workflow entities, I would like to evaluate the "direct" support
of Spark.
Generally speaking, I wonder whether, "optionally", we could leverage
Spark for some internal Falcon processes (like eviction, etc).
WDYT ?
Regards
JB
--
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
Re: Falcon with Spark ?
Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Yes, that's what I saw.
Regarding the heavy dependency, that's why I described it as "optional": the
user has to enable it explicitly and use a specific configuration (knowing
what they are doing).
As for Oozie, I agree, but again it requires a workflow.xml for Oozie.
My plan is to spare users from having to provide a workflow.xml, and instead
let them use the process configuration to define the job to run
(directly MapReduce, Spark, etc).
Regards
JB
On 02/03/2015 06:30 PM, Srikanth Sundarrajan wrote:
> Yes, support for Spark was specifically added through https://issues.apache.org/jira/browse/OOZIE-1983, to allow users of Oozie or Falcon to run Spark jobs. Moving the retention job to Spark would create a heavy dependency on Spark within Falcon.
>
> With https://issues.apache.org/jira/browse/FALCON-965, it should be possible to create an alternate implementation of eviction.
>
> Regards
> Srikanth Sundarrajan
--
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
RE: Falcon with Spark ?
Posted by Srikanth Sundarrajan <sr...@hotmail.com>.
Yes, support for Spark was specifically added through https://issues.apache.org/jira/browse/OOZIE-1983, to allow users of Oozie or Falcon to run Spark jobs. Moving the retention job to Spark would create a heavy dependency on Spark within Falcon.
With https://issues.apache.org/jira/browse/FALCON-965, it should be possible to create an alternate implementation of eviction.
Regards
Srikanth Sundarrajan