Posted to dev@chukwa.apache.org by Eric Yang <ey...@yahoo-inc.com> on 2009/12/18 02:12:28 UTC

PipelineStageWriter doesn't work as expected

Hi all,

I'd set up SocketTeeWriter by itself, with data streaming to the next socket
reader program.  When I tried to configure two writers, i.e., SeqFileWriter
followed by SocketTeeWriter, it didn't work because SeqFileWriter doesn't
extend PipelineableWriter.  I went ahead, extended SeqFileWriter as a
PipelineableWriter, implemented the setNextStage method, and configured the
collector with:

  <property>
    <name>chukwaCollector.writerClass</name>
    <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
  </property>

  <property>
    <name>chukwaCollector.pipeline</name>
    <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
  </property>

SeqFileWriter writes the data correctly, but when connected to
SocketTeeWriter, no data is visible in SocketTeeWriter.  Commands work
fine, but data streaming doesn't happen.  How do I configure the
collector and PipelineStageWriter to write data into multiple writers?
Is there something in SeqFileWriter that could prevent this from working?

Regards,
Eric


Re: PipelineStageWriter doesn't work as expected

Posted by Ariel Rabkin <as...@gmail.com>.
I would do this by making PipelineableWriter an abstract class rather
than an interface.  I've opened CHUKWA-432 for this.  Jerome, is that
what you had in mind?
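For reference, a minimal sketch of that abstract-class shape. This is illustrative only: the `ChukwaWriter` stand-in below is reduced to a single method, and the real Chukwa interface has more methods and different signatures.

```java
// Stand-in for Chukwa's ChukwaWriter interface, reduced to one
// method for illustration; the real interface is richer.
interface ChukwaWriter {
    void add(String chunk);
}

// PipelineableWriter as an abstract class rather than an interface:
// the "next" reference and setNextStage() live here once, so every
// concrete writer inherits them instead of re-implementing them.
abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;

    public void setNextStage(ChukwaWriter next) {
        this.next = next;
    }
}
```

With this shape, a concrete writer only has to implement `add()` and remember to forward to `next` when it is set.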




-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department

Re: PipelineStageWriter doesn't work as expected

Posted by Eric Yang <ey...@yahoo-inc.com>.
I agree that the current writer should be kept as-is.  My modified
SeqFileWriter will be renamed to PipelineSeqFileWriter.  I also like the
idea of an abstract class to reduce duplicated code across writer
implementations.  +1 on CHUKWA-432.

Regards,
Eric



Re: PipelineStageWriter doesn't work as expected

Posted by Jerome Boulon <jb...@netflix.com>.
Hi Eric,
Can you create another class that takes a writer and makes it a pipeline
writer?  The pipeline logic should be extracted, and the current writers
should be kept clean.

I'm saying that because I have a new writer implementation, and I would
have to do something similar to what you're doing for near-real-time
monitoring.

Thanks,
  /Jerome.
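Jerome's wrapper idea could look roughly like this. The class and method names here are hypothetical, not actual Chukwa classes; the point is that the wrapped writer (e.g. SeqFileWriter) stays completely unmodified.

```java
// Stand-in writer interface, reduced to one method for illustration.
interface ChukwaWriter {
    void add(String chunk);
}

// Hypothetical wrapper: adapts any existing writer into a pipeline
// stage without touching the writer's own code.
class PipelineWrapper implements ChukwaWriter {
    private final ChukwaWriter wrapped; // the real writer being adapted
    private ChukwaWriter next;          // downstream stage, if any

    PipelineWrapper(ChukwaWriter wrapped) {
        this.wrapped = wrapped;
    }

    public void setNextStage(ChukwaWriter next) {
        this.next = next;
    }

    @Override
    public void add(String chunk) {
        wrapped.add(chunk);   // let the wrapped writer do its normal work
        if (next != null) {
            next.add(chunk);  // then pass the same data downstream
        }
    }
}
```

The collector would then instantiate the wrapper around each configured writer, so pipeline plumbing lives in exactly one place.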



Re: PipelineStageWriter doesn't work as expected

Posted by Eric Yang <ey...@yahoo-inc.com>.
Correction: the data has been written to HDFS correctly.  It was stuck at
post-processing because the postProcess program crashed; I still need to
determine the cause of the crash.  I think the modified SeqFileWriter does
what I wanted, and I will implement next.add() so that the ordering of
writers can be interchanged.

Regards,
Eric



Re: PipelineStageWriter doesn't work as expected

Posted by Eric Yang <ey...@yahoo-inc.com>.
I'd like to make a T on the incoming data: one writer goes into HDFS, and
another enables real-time pub/sub monitoring of the data.  In my case the
data are mirrored, not filtered.  However, I'm not getting the right
result, because the data doesn't seem to get written into HDFS regardless
of the ordering of the writers.

Regards,
Eric



Re: PipelineStageWriter doesn't work as expected

Posted by Ariel Rabkin <as...@gmail.com>.
What's the use case for this?

The original motivation for pipelined writers was so that we could do
things like filtering before data got written.  Then it occurred to me
that SocketTeeWriter fit fairly naturally into a pipeline.

Putting it "after" the seq file writer wouldn't be too bad --
SeqFileWriter.add() would need to call next.add().  But I would be
hesitant to commit that change without a really clear use case.

--Ari
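The forwarding described above is a one-liner in spirit: each stage does its own write, then hands the same chunk to next.add(). A self-contained sketch, with hypothetical names (the real SeqFileWriter writes Hadoop sequence files and does error handling that is omitted here):

```java
// Stand-in writer interface, reduced to one method for illustration.
interface ChukwaWriter {
    void add(String chunk);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) { this.next = next; }
}

// Hypothetical pipelined SeqFileWriter: writes "to HDFS" (a buffer
// here), then forwards so a downstream SocketTeeWriter still sees
// the data.
class TeeingSeqFileWriter extends PipelineableWriter {
    private final StringBuilder hdfs = new StringBuilder(); // stand-in sink

    @Override
    public void add(String chunk) {
        hdfs.append(chunk);      // 1) the writer's normal work
        if (next != null) {
            next.add(chunk);     // 2) forward the chunk unchanged
        }
    }

    String written() { return hdfs.toString(); }
}
```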




-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department

Re: PipelineStageWriter doesn't work as expected

Posted by Eric Yang <ey...@yahoo-inc.com>.
It works fine after I put SocketTeeWriter first.  What needs to be
implemented in SeqFileWriter for it to pipe correctly?

Regards,
Eric



Re: PipelineStageWriter doesn't work as expected

Posted by as...@gmail.com.
Put the SocketTeeWriter first.

sent from my iPhone; please excuse typos and brevity.
