Posted to user@hadoop.apache.org by Fabio <an...@gmail.com> on 2015/01/23 09:41:28 UTC

Reliability of timestamps in logs

Hi guys,
while analyzing SLS logs I noticed some unexpected behavior, such as 
resource requests sent before the AM container reaches the RUNNING state.
For this reason I started wondering how reliable the timestamps of the 
log entries are.
Does log4j run on an independent thread? If so, could that be the reason 
why some log entries appear misplaced? Or are they supposed to be in 
strict execution order in any case?
I ask because I am validating a project and I need to be pretty 
sure about what happens when.

Thanks

Fabio

Re: Reliability of timestamps in logs

Posted by Fabio <an...@gmail.com>.
Thanks for the diagrams tip! I will try it.
(SLS is the Scheduler Load Simulator 
http://hadoop.apache.org/docs/r2.6.0/hadoop-sls/SchedulerLoadSimulator.html 
)

Regards

Fabio

On 01/27/2015 11:04 PM, Ravi Prakash wrote:
> I'm afraid I don't know what the "SLS" is. Obviously it shouldn't 
> matter if it runs on the same node. I don't think hadoop code ever 
> updates the system clock. In fact it shouldn't even be run with the 
> perms to do so.
> It depends on log4j appenders whether they buffer and batch the 
> messages before writing to disk. I would think that the timestamp 
> would still be the time the messages were received (rather than the 
> time they were flushed to disk)
>
> I am not sure if you know but you can get beautiful state diagrams of 
> the different agents by running $mvn -Pvisualize and then using dot to 
> convert the *.gv files to png . I'd found that helped me a lot
>
>
> On Monday, January 26, 2015 11:20 PM, Fabio <an...@gmail.com> wrote:
>
>
> Yes I am, does it make a difference? SLS runs on a single machine, 
> wrapping the RM and simulating the nodes, thus it should use just the 
> system time.
> Or do you mean there is a chance it's updating the clock while the job 
> is running?
>
> Regards
>
> Fabio
>
> On 01/26/2015 08:00 PM, Ravi Prakash wrote:
>> Are you running NTP?
>>
>>
>> On Friday, January 23, 2015 12:42 AM, Fabio <an...@gmail.com> 
>> <ma...@gmail.com> wrote:
>>
>>
>> Hi guys,
>> while analyzing SLS logs I noticed some unexpected behaviors, such as
>> resources requests sent before the AM container gets to a RUNNING state.
>> For this reason I started wondering how reliable is the timestamp of the
>> log entries.
>> Does log4j run on an independent thread? If yes, could it be the reason
>> why some log entries appear as misplaced? Or are they supposed to be in
>> strict execution order in any case?
>> I ask this because I am validating a project and I need to be pretty
>> sure about what happens when.
>>
>> Thanks
>>
>> Fabio
>>
>>
>
>
>


Re: Reliability of timestamps in logs

Posted by Ravi Prakash <ra...@ymail.com>.
I'm afraid I don't know what the "SLS" is. Obviously it shouldn't matter if it runs on the same node. I don't think Hadoop code ever updates the system clock. In fact it shouldn't even be run with the perms to do so.
It depends on the log4j appenders whether they buffer and batch the messages before writing to disk. I would think that the timestamp would still be the time the messages were received (rather than the time they were flushed to disk).
I am not sure if you know, but you can get beautiful state diagrams of the different agents by running $ mvn -Pvisualize and then using dot to convert the *.gv files to png. I found that helped me a lot.
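
One way to check whether entries really are out of order is to scan the log for places where the timestamp goes backwards from one entry to the next. The following is only a minimal sketch in Python, not something taken from Hadoop or SLS: it assumes every entry starts with a log4j ISO8601-style date ("yyyy-MM-dd HH:mm:ss,SSS"), and the name find_inversions is made up for illustration.

```python
import re
from datetime import datetime

# Assumption: each entry begins with "yyyy-MM-dd HH:mm:ss,SSS"
# (the log4j ISO8601 date format). Adjust the pattern if your layout differs.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def find_inversions(lines):
    """Return (line_number, previous_ts, current_ts) for every entry whose
    timestamp is earlier than that of the entry before it.

    Hypothetical helper for illustration only.
    """
    inversions = []
    prev = None
    for n, line in enumerate(lines, start=1):
        m = TS_RE.match(line)
        if not m:
            # Continuation line (for example a stack trace): no timestamp.
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if prev is not None and ts < prev:
            inversions.append((n, prev, ts))
        prev = ts
    return inversions
```

If this reports nothing, the file order and the timestamp order agree. Entries that share the same millisecond cannot be ordered by timestamp at all, so for those the position in the file is the only evidence available.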


     On Monday, January 26, 2015 11:20 PM, Fabio <an...@gmail.com> wrote:
   

  Yes I am, does it make a difference? SLS runs on a single machine, wrapping the RM and simulating the nodes, thus it should use just the system time. 
 Or do you mean there is a chance it's updating the clock while the job is running?
 
 Regards
 
 Fabio
 
 On 01/26/2015 08:00 PM, Ravi Prakash wrote:
  
 Are you running NTP?
  
 
       On Friday, January 23, 2015 12:42 AM, Fabio <an...@gmail.com> wrote:
   
 
 Hi guys,
 while analyzing SLS logs I noticed some unexpected behaviors, such as 
 resources requests sent before the AM container gets to a RUNNING state.
 For this reason I started wondering how reliable is the timestamp of the 
 log entries.
 Does log4j run on an independent thread? If yes, could it be the reason 
 why some log entries appear as misplaced? Or are they supposed to be in 
 strict execution order in any case?
 I ask this because I am validating a project and I need to be pretty 
 sure about what happens when.
 
 Thanks
 
 Fabio
 
 
      
 
 

    

Re: Reliability of timestamps in logs

Posted by Fabio <an...@gmail.com>.
Yes I am, does it make a difference? SLS runs on a single machine, 
wrapping the RM and simulating the nodes, thus it should use just the 
system time.
Or do you mean there is a chance it's updating the clock while the job 
is running?

Regards

Fabio

On 01/26/2015 08:00 PM, Ravi Prakash wrote:
> Are you running NTP?
>
>
> On Friday, January 23, 2015 12:42 AM, Fabio <an...@gmail.com> wrote:
>
>
> Hi guys,
> while analyzing SLS logs I noticed some unexpected behaviors, such as
> resources requests sent before the AM container gets to a RUNNING state.
> For this reason I started wondering how reliable is the timestamp of the
> log entries.
> Does log4j run on an independent thread? If yes, could it be the reason
> why some log entries appear as misplaced? Or are they supposed to be in
> strict execution order in any case?
> I ask this because I am validating a project and I need to be pretty
> sure about what happens when.
>
> Thanks
>
> Fabio
>
>


Re: Reliability of timestamps in logs

Posted by Ravi Prakash <ra...@ymail.com>.
Are you running NTP?
 

     On Friday, January 23, 2015 12:42 AM, Fabio <an...@gmail.com> wrote:
   

 Hi guys,
while analyzing SLS logs I noticed some unexpected behaviors, such as 
resources requests sent before the AM container gets to a RUNNING state.
For this reason I started wondering how reliable is the timestamp of the 
log entries.
Does log4j run on an independent thread? If yes, could it be the reason 
why some log entries appear as misplaced? Or are they supposed to be in 
strict execution order in any case?
I ask this because I am validating a project and I need to be pretty 
sure about what happens when.

Thanks

Fabio


    
