Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2013/11/14 21:53:21 UTC
[jira] [Comment Edited] (CONNECTORS-781) Fault-Tolerant Setup for ManifoldCF Agent.

    [ https://issues.apache.org/jira/browse/CONNECTORS-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822892#comment-13822892 ] 

Karl Wright edited comment on CONNECTORS-781 at 11/14/13 8:52 PM:
------------------------------------------------------------------

Hi Graeme,

What we want is the following:

- repository connections and output connections are locally pooled, because setup costs are considerable in some cases
- the total number of open repository or output connections across all cluster members never exceeds the respective maximum given by the user; that maximum may indeed be license-limited

You can perhaps now see the issue.  If cluster member A needs output connection handle X but has none free in its local pool, it has to make sure enough instances are available under the cluster-wide limit (presumably by requesting that other cluster members shut down their free instances) before creating a new one of its own.  Free instances *will* appear on other cluster members over time, but only because idle connections expire, after something like 5 minutes.  That's probably not going to fly.
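
To make the problem concrete, here is a minimal Java sketch.  It is not ManifoldCF code: ClusterLimitSketch, GlobalCounter and LocalPool are hypothetical names, and the cluster-wide count is simulated with an in-process AtomicInteger, where a real cluster would need some shared or distributed counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; none of these types exist in ManifoldCF. */
final class ClusterLimitSketch {

    /** Stand-in for cluster-wide state. Assumption: a real cluster needs a distributed counter. */
    static final class GlobalCounter {
        private final int limit;
        private final AtomicInteger open = new AtomicInteger();

        GlobalCounter(int limit) { this.limit = limit; }

        /** Reserves one slot under the cluster-wide limit, or returns false if the limit is reached. */
        boolean tryReserve() {
            while (true) {
                int current = open.get();
                if (current >= limit) return false;
                if (open.compareAndSet(current, current + 1)) return true;
            }
        }

        void release() { open.decrementAndGet(); }

        int openCount() { return open.get(); }
    }

    /** One cluster member's local pool; only counts are tracked, no real connections. */
    static final class LocalPool {
        private final GlobalCounter global;
        private int idle; // connections that are open but currently unused on this member

        LocalPool(GlobalCounter global) { this.global = global; }

        /** Reuses an idle local connection if possible, else tries to open a new one. */
        synchronized boolean acquire() {
            if (idle > 0) { idle--; return true; }
            return global.tryReserve();
        }

        /** Returns a connection to the local pool; it stays open and still counts globally. */
        synchronized void giveBack() { idle++; }

        /** Closes all idle connections, as an expiry timer or a request from a peer might. */
        synchronized int closeIdle() {
            int closed = idle;
            for (int i = 0; i < closed; i++) global.release();
            idle = 0;
            return closed;
        }
    }
}
```

With a limit of 2, if member B has opened two connections and returned both to its own pool, member A's acquire() fails even though nothing is in use; it only succeeds once B calls closeIdle().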

It's probably better to allocate connections up front.  But even then, what do you do when there are more cluster members than there are connections allowed?
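
A rough sketch of what up-front allocation could look like, and where it breaks down.  ConnectionQuotaSketch and allocate are hypothetical names, and the even split with the remainder going to the lowest-numbered members is just one assumed policy, not something decided in this ticket.

```java
/** Hypothetical illustration only; not part of ManifoldCF. */
final class ConnectionQuotaSketch {

    /**
     * Splits maxConnections across memberCount members as evenly as possible.
     * Members with the lowest indexes receive one extra connection until the
     * remainder is used up.
     */
    static int[] allocate(int maxConnections, int memberCount) {
        if (maxConnections < 0 || memberCount <= 0) {
            throw new IllegalArgumentException("need maxConnections >= 0 and memberCount > 0");
        }
        int[] quotas = new int[memberCount];
        int base = maxConnections / memberCount;
        int remainder = maxConnections % memberCount;
        for (int i = 0; i < memberCount; i++) {
            quotas[i] = base + (i < remainder ? 1 : 0);
        }
        return quotas;
    }
}
```

For example, allocate(10, 3) gives quotas of 4, 3 and 3, while allocate(2, 4) gives 1, 1, 0 and 0: two of the four members get no connection at all, which is exactly the open question.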



> Fault-Tolerant Setup for ManifoldCF Agent.
> ------------------------------------------
>
>                 Key: CONNECTORS-781
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-781
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Framework agents process, Framework core, Framework crawler agent
>    Affects Versions: ManifoldCF 1.5
>            Reporter: Swami Rajamohan
>            Assignee: Karl Wright
>              Labels: agents, crawler, fault-tolerance
>             Fix For: ManifoldCF 1.5
>
>
> It should be possible to set up ManifoldCF as a fault-tolerant infrastructure.
> The Agent component of ManifoldCF should support multiple agent instances crawling against a single crawl store, both to distribute (share) the crawl load and to pick up a request that is abruptly terminated because the instance is partitioned or fails.
> Since there is a proposal to move to a store like Voldemort, it would be nice to have a fault-tolerant infrastructure.


