Posted to dev@chukwa.apache.org by "Eric Yang (JIRA)" <ji...@apache.org> on 2013/01/05 19:42:12 UTC

[jira] [Commented] (CHUKWA-678) Make use of ChukwaWriter in agent

    [ https://issues.apache.org/jira/browse/CHUKWA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544779#comment-13544779 ] 

Eric Yang commented on CHUKWA-678:
----------------------------------

The list of sending events should be a ring buffer.  If a send still fails after multiple retries, the data should be discarded.  The failure can be recorded either by recursively injecting an error entry into the ring buffer or by logging it locally.  For handling pipeline failure, our general rule of thumb is to throw an exception back to the agent when any write fails to commit: if one or more of the writers fail, we throw.  The chunk will then be retried, which means multiple data sinks can receive duplicated data.  We have a unique sequence number in our metadata, so de-duplication can happen synchronously (in the writer) or asynchronously (in an out-of-band MapReduce process).  We provide a single commit status from the pipeline writer instead of sending a List of results back to the agent.  This makes sure the retry and de-dupe logic can be implemented correctly.
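
To make the ring buffer idea concrete, here is a minimal sketch in Java. SendRing, offer() and drainOnce() are hypothetical names, not existing Chukwa classes. Overwriting the oldest entry when the ring is full and counting attempts per entry are assumptions of this sketch, not something decided in this issue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical fixed-capacity ring of pending chunks; not a Chukwa class. */
final class SendRing<T> {
    /** A pending chunk plus the number of send attempts made so far. */
    private static final class Entry<T> {
        final T chunk;
        int attempts;
        Entry(T chunk) { this.chunk = chunk; }
    }

    private final ArrayDeque<Entry<T>> ring = new ArrayDeque<>();
    private final List<T> discarded = new ArrayList<>();
    private final int capacity;
    private final int maxAttempts;

    SendRing(int capacity, int maxAttempts) {
        this.capacity = capacity;
        this.maxAttempts = maxAttempts;
    }

    /** Adds a chunk; when the ring is full the oldest entry is discarded (assumption). */
    void offer(T chunk) {
        if (ring.size() == capacity) {
            discarded.add(ring.pollFirst().chunk);
        }
        ring.addLast(new Entry<>(chunk));
    }

    /**
     * Makes one pass over the entries currently in the ring. The sender
     * returns true when the chunk was committed. Returns the number sent.
     */
    int drainOnce(Predicate<T> sender) {
        int sent = 0;
        int n = ring.size();
        for (int i = 0; i < n; i++) {
            Entry<T> e = ring.pollFirst();
            e.attempts++;
            if (sender.test(e.chunk)) {
                sent++;
            } else if (e.attempts >= maxAttempts) {
                discarded.add(e.chunk);   // give up; the caller can log this locally
            } else {
                ring.addLast(e);          // keep the chunk for the next pass
            }
        }
        return sent;
    }

    int pending() { return ring.size(); }

    List<T> discarded() { return discarded; }
}
```

The discarded() list is where the error entry would come from, whether it is logged locally or injected back into the ring.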
                
> Make use of ChukwaWriter in agent
> ---------------------------------
>
>                 Key: CHUKWA-678
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-678
>             Project: Chukwa
>          Issue Type: Sub-task
>          Components: Data Collection
>         Environment: MacOSX, Java 6
>            Reporter: shreyas subramanya
>
> The Chukwa agent sends out data chunks to various destinations through the combination of the Connector and ChukwaSender interfaces. For sending chunks to the collector, we have HTTP implementations of these interfaces. The collector writes out the received chunks to various destinations through classes implementing the ChukwaWriter interface. Optionally, multiple destinations can be chosen by specifying PipelineStageWriter.
> The proposal is to:
> 1. Use ChukwaWriter to send out data chunks to multiple destinations from the agent. Further, PipelineStageWriter can be made the default, with the pipeline configuration specified in the agent config file.
> 2. Implement (or modify) Pipelineable writers for HBase, Http, Hdfs and WebHdfs
> 3. Do away with the Connector interface and have a single non-configurable connector object as part of the agent. This class initiates the configured writer, waits for data chunks and passes the chunks to Writer.add()/send(). The connection protocol for each destination is handled by the init() of the individual writers.
> Considerations:
> 1. In case of Pipelineable writers, we need a way to merge the results of each pipeline stage before the agent commits the chunk.
> 2. Handling pipeline failure
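
Both considerations can be sketched together in Java, following the rule from the comment above: one commit outcome for the whole pipeline, and de-dupe by sequence number in the sink. StageWriter, WriteFailedException, SketchPipeline and DedupSink are hypothetical stand-ins, not the real ChukwaWriter or PipelineStageWriter API. The flat list of stages, and running the remaining stages after one has failed, are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical failure type for this sketch. */
class WriteFailedException extends Exception {
    WriteFailedException(String message) { super(message); }
}

/** Hypothetical stand-in for one writer stage; not the real ChukwaWriter interface. */
interface StageWriter {
    void add(long seqId, String data) throws WriteFailedException;
}

/** Runs every stage and reports a single outcome: normal return, or one exception. */
final class SketchPipeline {
    private final List<StageWriter> stages;

    SketchPipeline(List<StageWriter> stages) { this.stages = stages; }

    void add(long seqId, String data) throws WriteFailedException {
        List<String> failures = new ArrayList<>();
        for (StageWriter stage : stages) {
            try {
                stage.add(seqId, data);
            } catch (WriteFailedException e) {
                failures.add(e.getMessage());
            }
        }
        // Merge the per-stage results: any failure fails the whole commit,
        // so the agent keeps the chunk and retries it later.
        if (!failures.isEmpty()) {
            throw new WriteFailedException("stages failed: " + failures);
        }
    }
}

/** Sink that ignores a chunk whose sequence number it has already stored. */
final class DedupSink implements StageWriter {
    private final Set<Long> seen = new HashSet<>();
    final List<String> stored = new ArrayList<>();

    @Override
    public void add(long seqId, String data) {
        if (seen.add(seqId)) {   // Set.add returns false for a duplicate
            stored.add(data);
        }
    }
}
```

With this shape the agent only distinguishes "committed" from "retry". A sink that already stored the chunk on the first attempt drops the duplicate on the retry.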
