Posted to dev@chukwa.apache.org by Shweta Shah <sh...@gmail.com> on 2011/05/31 20:33:54 UTC

Closing .chukwa files in collector at regular fixed offset interval to avoid time interval drift

Hi,

We have a system built on top of Chukwa that introduces the need to have all
sequence files for a given rotateInterval closed and available at a
predictable time.  Currently, we are experiencing some drift in the times
that the sequence files are closed due to the way the TimerTask is scheduled
in the SeqFileWriter class.  We would like to submit a solution that will
allow people to configure the time all collectors should close their files
for processing in a given interval, while still supporting the default
functionality.  We have written the code and are currently testing the
functionality, but would like to know if anyone has any feedback on this
development before submitting a JIRA ticket.
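
To make the drift concrete, here is a minimal sketch that compares two ways of picking close times. It is not the SeqFileWriter code: the class name RotationDrift, both method names, and the overheadMs parameter (time spent closing a file before the next timer is armed) are hypothetical and chosen only for illustration. The sketch assumes the drift comes from each rotation being scheduled relative to when the previous one finished.

```java
import java.util.ArrayList;
import java.util.List;

public class RotationDrift {

    /** Close times when each rotation is armed one interval after the previous rotation finished. */
    static List<Long> relativeSchedule(long startMs, long intervalMs, long overheadMs, int rotations) {
        List<Long> closes = new ArrayList<>();
        long t = startMs;
        for (int i = 0; i < rotations; i++) {
            t += intervalMs;   // timer fires one interval after it was armed
            closes.add(t);
            t += overheadMs;   // hypothetical cost of closing the file before re-arming
        }
        return closes;
    }

    /** Close times when each rotation is pinned to a multiple of the interval. */
    static List<Long> alignedSchedule(long startMs, long intervalMs, int rotations) {
        List<Long> closes = new ArrayList<>();
        long next = (startMs / intervalMs + 1) * intervalMs; // first boundary after start
        for (int i = 0; i < rotations; i++) {
            closes.add(next);
            next += intervalMs;
        }
        return closes;
    }

    public static void main(String[] args) {
        long fiveMinutes = 5 * 60 * 1000L;
        System.out.println("relative: " + relativeSchedule(0L, fiveMinutes, 2000L, 3));
        System.out.println("aligned:  " + alignedSchedule(0L, fiveMinutes, 3));
    }
}
```

In the relative case every rotation inherits the accumulated overhead of the ones before it, so collectors started at different moments never converge on a common close time. In the aligned case the close time depends only on the clock.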

- Shweta

Re: Closing .chukwa files in collector at regular fixed offset interval to avoid time interval drift

Posted by Bill Graham <bi...@gmail.com>.
I work with Shweta and this is something we've been considering for a while.
The idea is that collectors could be configured to close their files at some
fixed offset after the end of the interval. In our case, we'd like files
to close at 30 seconds after the end of a 5 minute period. This will help us
with a couple of edge cases that the current approach causes for us.
Backward compatibility and default behavior would of course be maintained.
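
As a rough sketch of the calculation involved (not the patch itself), the next close time can be derived from the current time, the rotate interval, and the offset. The class FixedOffsetRotation and its method names are hypothetical, and the sketch assumes interval boundaries are measured from the Unix epoch, which lines up with wall-clock 5 minute marks.

```java
public class FixedOffsetRotation {

    /**
     * Returns the next instant strictly after nowMs that falls offsetMs past
     * an interval boundary. All values are milliseconds since the epoch.
     */
    static long nextCloseTimeMs(long nowMs, long intervalMs, long offsetMs) {
        if (intervalMs <= 0 || offsetMs < 0 || offsetMs >= intervalMs) {
            throw new IllegalArgumentException("require 0 <= offset < interval");
        }
        long boundary = Math.floorDiv(nowMs, intervalMs) * intervalMs; // start of current interval
        long candidate = boundary + offsetMs;
        // If this interval's close point has already passed, use the next interval's.
        return candidate > nowMs ? candidate : candidate + intervalMs;
    }

    /** Delay to hand to a one-shot timer so that it fires at the next close time. */
    static long delayUntilNextCloseMs(long nowMs, long intervalMs, long offsetMs) {
        return nextCloseTimeMs(nowMs, intervalMs, offsetMs) - nowMs;
    }

    public static void main(String[] args) {
        long interval = 5 * 60 * 1000L; // 5 minute period
        long offset = 30 * 1000L;       // close 30 seconds after the period ends
        long now = System.currentTimeMillis();
        System.out.println("next close in ms: " + delayUntilNextCloseMs(now, interval, offset));
    }
}
```

Recomputing the delay from the clock before each rotation, rather than reusing a fixed period, means time spent closing a file does not carry over into the next close time, and every collector with a synchronized clock arrives at the same instant.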


On Tue, May 31, 2011 at 2:12 PM, Eric Yang <ey...@yahoo-inc.com> wrote:

> +1 for JIRA, nice to have feature.
>
> Regards,
> Eric

Re: Closing .chukwa files in collector at regular fixed offset interval to avoid time interval drift

Posted by Eric Yang <ey...@yahoo-inc.com>.
+1 for JIRA, nice to have feature.

Regards,
Eric

On 5/31/11 2:01 PM, "Ariel Rabkin" <as...@gmail.com> wrote:

Also.

This sounds like something fairly site-specific. It would be really
good to have unit tests so we don't break this
feature in the future.

--Ari

Re: Closing .chukwa files in collector at regular fixed offset interval to avoid time interval drift

Posted by Ariel Rabkin <as...@gmail.com>.
Also.

This sounds like something fairly site-specific. It would be really
good to have unit tests so we don't break this
feature in the future.

--Ari

On Tue, May 31, 2011 at 2:01 PM, Ariel Rabkin <as...@gmail.com> wrote:
> Not quite sure I understand the use case, but if it seems useful,
> sure, open a JIRA.
>
> --Ari

-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department

Re: Closing .chukwa files in collector at regular fixed offset interval to avoid time interval drift

Posted by Ariel Rabkin <as...@gmail.com>.
Not quite sure I understand the use case, but if it seems useful,
sure, open a JIRA.

--Ari

On Tue, May 31, 2011 at 11:33 AM, Shweta Shah <sh...@gmail.com> wrote:
> Hi,
>
> We have a system built on top of Chukwa that introduces the need to have all
> sequence files for a given rotateInterval closed and available at a
> predictable time.  Currently, we are experiencing some drift in the times
> that the sequence files are closed due to the way the TimerTask is scheduled
> in the SeqFileWriter class.  We would like to submit a solution that will
> allow people to configure the time all collectors should close their files
> for processing in a given interval, while still supporting the default
> functionality.  We have written the code and are currently testing the
> functionality, but would like to know if anyone has any feedback on this
> development before submitting a JIRA ticket.
>
> - Shweta
>



-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department