Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2010/08/31 12:01:00 UTC

[jira] Commented: (CONNECTORS-41) Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.

    [ https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904582#action_12904582 ] 

Karl Wright commented on CONNECTORS-41:
---------------------------------------

I looked at this in some detail yesterday.  The primary implementation option is to add notification methods to IOutputConnector, so that job events get reported to the connector when the job is being terminated.  The issue in this case is going to be how exactly to handle ServiceInterruption exceptions that occur at the time of the notification call into the connector.  This is not hypothetical, because in the Solr case a notification may well fail, or it may take a very long time (many minutes).  Usually, the possibility of such an extended interaction argues for an additional state in the database.

It looks like it will not be possible to delay the change of the job status, since that takes place in a transaction.  Were such a delay possible, a failed notification could simply leave the job in the "running" state, and a retry would naturally occur until the commit succeeded.  But that doesn't look possible given the transaction structure.

An alternative (non-notification) method of handling a commit request would require the commit to take place as part of the output connector's poll() method.  This is a little better to work with because the poll() method will naturally retry in any case.  The issue here is that there would be no *guarantee* of a commit taking place at all, since it isn't part of the connector contract that the connection must continue to exist for any period of time, which I think would violate the spirit of this ticket.

If explicit notification takes place, we could just report any error, and forget about it, rather than keeping the job alive for a retry.  That, too, would mean that a commit was not guaranteed to occur during the job's lifecycle.

The final alternative, which would seemingly work, would involve there being two job shutdown states: one prior to notification, and the second after notification.  The first state would be entered based on the current shutdown logic.  The second state would be entered only after the notification had been successful.  Thus, the notification *could* be called more than once, if there were errors, or if the crawler were shut down and restarted before the state transition was completed.  The extra state would also allow the job's pre-notification status to be shown in the crawler UI.
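
To make the two-state idea concrete, here is a rough sketch of the transitions.  It is only an illustration under assumptions: the state names, the JobNotifier interface, and the notifyJobEnd() method are all hypothetical and are not existing framework API, and a plain Exception stands in for ServiceInterruption.

```java
// Hypothetical sketch; none of these names are actual framework API.
enum JobState { RUNNING, SHUTTING_DOWN_PENDING_NOTIFY, INACTIVE }

interface JobNotifier {
    // Stand-in for the notification into the output connector (e.g. a Solr commit).
    // A plain Exception stands in for ServiceInterruption here.
    void notifyJobEnd() throws Exception;
}

class JobShutdown {
    private JobState state = JobState.RUNNING;

    JobState getState() { return state; }

    // First transition: entered by the existing shutdown logic.
    void beginShutdown() {
        if (state == JobState.RUNNING) {
            state = JobState.SHUTTING_DOWN_PENDING_NOTIFY;
        }
    }

    // Second transition: taken only after the notification succeeds.
    // Returns true if the job reached its final state on this attempt.
    boolean attemptNotification(JobNotifier notifier) {
        if (state != JobState.SHUTTING_DOWN_PENDING_NOTIFY) {
            return false;
        }
        try {
            notifier.notifyJobEnd();
        } catch (Exception e) {
            // Stay in the pending state so the notification can be retried later.
            return false;
        }
        state = JobState.INACTIVE;
        return true;
    }
}
```

Note that a failed attempt leaves the job in the pending state, so the notifier may be invoked more than once; a connector written against something like this would need to tolerate repeated notifications.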

Because of the potential time delay of a commit, it is probably best for the first to second shutdown state transition to be handled by a separate thread, or family of threads.
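
As a rough illustration of what such a thread might look like, here is a minimal sketch.  Everything in it is an assumption for the sake of discussion: PendingNotification, NotificationThread, and the fixed retry delay are hypothetical names and choices, not existing framework classes.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; these types do not exist in the framework.
interface PendingNotification {
    void send() throws Exception;  // stand-in for a slow or failing connector call
    void markComplete();           // stand-in for the second state transition
}

class NotificationThread extends Thread {
    private final BlockingQueue<PendingNotification> queue = new LinkedBlockingQueue<>();
    private final long retryDelayMillis;

    NotificationThread(long retryDelayMillis) {
        this.retryDelayMillis = retryDelayMillis;
        setDaemon(true);
    }

    // Called by the shutdown logic once a job has entered the first shutdown state.
    void submit(PendingNotification notification) {
        queue.add(notification);
    }

    @Override
    public void run() {
        try {
            while (true) {
                PendingNotification n = queue.take();
                boolean succeeded;
                try {
                    n.send();
                    succeeded = true;
                } catch (InterruptedException e) {
                    throw e;  // propagate shutdown requests to the outer handler
                } catch (Exception e) {
                    succeeded = false;
                }
                if (succeeded) {
                    n.markComplete();
                } else {
                    // Wait, then requeue so the notification is retried.
                    Thread.sleep(retryDelayMillis);
                    queue.add(n);
                }
            }
        } catch (InterruptedException e) {
            // Interrupted: exit; any pending work remains in the first shutdown state.
        }
    }
}
```

Keeping this work off the main job-management threads means a commit that takes many minutes would delay only the final state transition of the affected job.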


> Add hooks to output connectors for receiving event notifications, specifically job start, job end, etc.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-41
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-41
>             Project: Apache Connectors Framework
>          Issue Type: Improvement
>          Components: Framework core
>            Reporter: Karl Wright
>            Priority: Minor
>
> Currently there is no logic that informs an output connection of a job start, end, deletion, or other activity.  While this would seem to have little to do with an output connector, this feature has been requested by Jack Krupansky as a potential way of deciding when to tell Solr to commit documents, rather than leave it up to Solr's configuration.
