Posted to user@ignite.apache.org by Neeraj Vaidya <ne...@yahoo.co.in> on 2017/02/23 00:23:52 UTC

Real-time computing

Hi,

I have a use case where I need to perform computations on the records in a set of files (specifically, files containing telecom CDRs).

Regarding this, I have a few questions:

1) Should I have just one client node which reads these records and creates a Callable compute job for each record? With just one client node, I suppose this would be a single point of failure. I could use ZooKeeper to manage a cluster of such nodes, thus possibly mitigating the SPOF. (A rough sketch of both options follows question 2 below.)

2) Or should I stream/load these records into a cache using a client node, and then have the other cluster nodes pick up new entries from that cache and perform the computation on them? In this case, is there a way to make sure that only one node gets hold of computing each record?
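To make this concrete, here is roughly what I have in mind for both options (only a sketch; the class name, the "cdrCache" cache name and the helper methods are made up, and error handling is omitted):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.lang.IgniteCallable;

public class CdrLoader {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        // Option 1: submit one Callable compute job per CDR record.
        // Each job runs somewhere on the cluster, not necessarily here.
        for (final String record : readCdrRecords()) {
            ignite.compute().call(new IgniteCallable<Boolean>() {
                @Override public Boolean call() {
                    return process(record); // the per-record computation
                }
            });
        }

        // Option 2: stream the raw records into a cache so that other
        // nodes can pick them up and do the computation.
        ignite.getOrCreateCache("cdrCache");
        try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("cdrCache")) {
            long key = 0;
            for (String record : readCdrRecords())
                streamer.addData(key++, record);
        }
    }

    // Placeholders for reading the CDR file and for the actual computation.
    private static List<String> readCdrRecords() { return java.util.Collections.emptyList(); }
    private static boolean process(String record) { return true; }
}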

Regards,
Neeraj

Re: Real-time computing

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi Neeraj,

1. Why do you want to use ZooKeeper to mitigate the SPOF instead of relying
on Ignite's compute grid failover features?
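Failover for compute jobs comes with Ignite out of the box: if the node
executing a job leaves the cluster, the job is resubmitted to another node by
the failover SPI (AlwaysFailoverSpi by default). If you want to tune it, it is
just configuration, roughly like this (a sketch only; the retry count is an
arbitrary example):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.failover.always.AlwaysFailoverSpi;

public class FailoverConfigExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // AlwaysFailoverSpi is already the default; here we only raise
        // the number of times a failed job may be retried on other nodes.
        AlwaysFailoverSpi failoverSpi = new AlwaysFailoverSpi();
        failoverSpi.setMaximumFailoverAttempts(10); // example value
        cfg.setFailoverSpi(failoverSpi);

        Ignition.start(cfg);
    }
}

With that in place, jobs running on a node that fails are retried on the
remaining nodes automatically.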

2. If you need to reuse the data, then caching makes sense. For processing
new entries you can use events or continuous queries.
You are free to choose the number of nodes in your grid, and you can decide
which nodes will hold data and which ones will be used for computations.
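For example, a continuous query that reacts to every new entry in a cache
could look roughly like this (a sketch only; the "cdrCache" name and the
Long/String types just mirror the earlier example):

import javax.cache.event.CacheEntryEvent;
import javax.cache.event.CacheEntryUpdatedListener;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;

public class CdrListener {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        IgniteCache<Long, String> cache = ignite.getOrCreateCache("cdrCache");

        ContinuousQuery<Long, String> qry = new ContinuousQuery<>();

        // Invoked on this node for every new or updated entry in the cache.
        qry.setLocalListener(new CacheEntryUpdatedListener<Long, String>() {
            @Override public void onUpdated(
                Iterable<CacheEntryEvent<? extends Long, ? extends String>> evts) {
                for (CacheEntryEvent<? extends Long, ? extends String> e : evts)
                    System.out.println("Processing new CDR: " + e.getValue());
            }
        });

        // The listener stays active for as long as the cursor is open.
        QueryCursor<?> cur = cache.query(qry);
    }
}

You can also add a remote filter to the query so that only the entries you
are interested in are delivered to the listener.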

I'm not sure I understand the last question. Could you please describe that
use case in more detail?

On Thu, Feb 23, 2017 at 3:23 AM, Neeraj Vaidya <ne...@yahoo.co.in>
wrote:

> Hi,
>
> I have a use case where I need to perform computations on the records in a
> set of files (specifically, files containing telecom CDRs).
>
> Regarding this, I have a few questions:
>
> 1) Should I have just one client node which reads these records and creates
> a Callable compute job for each record? With just one client node, I suppose
> this would be a single point of failure. I could use ZooKeeper to manage a
> cluster of such nodes, thus possibly mitigating the SPOF.
>
> 2) Or should I stream/load these records into a cache using a client node,
> and then have the other cluster nodes pick up new entries from that cache
> and perform the computation on them? In this case, is there a way to make
> sure that only one node gets hold of computing each record?
>
> Regards,
> Neeraj
>



-- 
Best regards,
Andrey V. Mashenkov