You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-user@hadoop.apache.org by Stuart Smith <st...@yahoo.com> on 2010/04/22 22:25:28 UTC
Re: hadoop log timestamps & file timestamps not same as system time [SOLVED]

Ok. Solved the timestamp issue.

The best solution I found was:

export TZ=`cat /etc/timezone`
(which I added to .bashrc)

I found it here:
http://davidwinterbourne.com/?p=37

With the common explanation that it should be symbolic link or java scans /usr/share/zoneinfo looking for a match using some weird logic.

However, I'm not entirely comfortable with this explanation: 
I had two machines with the exact same configuration and the exact same files in /usr/share/zoneinfo, but with one reporting PST in java, and the other, PDT. So there could well be some other weirdness here.

Hopefully this solves the original issue (which was hbase regionservers continually being re-added to the master with a cycle of MSG_CALL_SERVER_STARTUP messages to the same machine). 
Oddly, hbase seemed to work fine, it just had a huge and ever growing list of regionservers stored the master, which would seem bad long-term.
Not sure whether this (hbase) issue would go in hbase, hadoop, or java, at this point ?

Thanks for the help!

Take care,
  -stu


--- On Thu, 4/22/10, Stuart Smith <st...@yahoo.com> wrote:

> From: Stuart Smith <st...@yahoo.com>
> Subject: Re: hadoop log timestamps & file timestamps not same as system time
> To: hdfs-user@hadoop.apache.org
> Date: Thursday, April 22, 2010, 3:28 PM
> Hello Eric,
> 
>   There's definitely an issue with timezones here
> (yes, deprecated, but to the point):
> 
> import java.util.Date; 
> import java.io.*; 
>  
> class testDateTime 
> { 
>         public static void main(
> String[] argv ) 
>         { 
>                
> Date d = new Date(); 
>                
> System.out.println( "date: " + d.toString() ); 
>         } 
> }
> 
> says:
> 
> stu@ubuntu-namenode:~/testtime$ java testDateTime 
> date: Thu Apr 22 11:37:32 PST 2010
> stu@ubuntu-namenode:~/testtime$ date
> Thu Apr 22 12:37:34 PDT 2010
> stu@ubuntu-namenode:~/testtime$ cat /etc/timezone 
> America/Los_Angeles
> 
> but then says:
> 
> stu@ubuntu-hadoop-1:~/testtime$ java testDateTime
> date: Thu Apr 22 12:38:11 PDT 2010
> stu@ubuntu-hadoop-1:~/testtime$ date
> Thu Apr 22 12:38:12 PDT 2010
> stu@ubuntu-hadoop-1:~/testtime$ cat /etc/timezone 
> America/Los_Angeles
> 
> Still digging into what's messing up java (or what java is
> messing up - every other program seems to be getting it
> right). 
> I've seen a couple posts around about localtime needing to
> be a symbolic link for java, etc, but they're all outdated,
> and not, apparently for ubuntu. 
> 
> I've diff'd the localtime files from both machines, and
> they're the same, so something really odd is going on here.
> 
> But at least you got me heading in the right direction :)
> 
> Hopefully my next post will be the one with the answer..
> 
> Take care,
>   -stu
> 
> 
> 
> --- On Thu, 4/22/10, Eric Sammer <es...@cloudera.com>
> wrote:
> 
> > From: Eric Sammer <es...@cloudera.com>
> > Subject: Re: hadoop log timestamps & file
> timestamps not same as system time
> > To: hdfs-user@hadoop.apache.org,
> stu24mail@yahoo.com
> > Date: Thursday, April 22, 2010, 2:00 AM
> > Java *should* use the machine's
> > configured TZ info (in whatever form
> > that manifests as on the platform in question). For
> Linux,
> > it usually
> > means /etc/localtime and / or possibly the environment
> in
> > which the
> > jvm was executed. If I recall this may be set
> explicitly
> > using some
> > magic (i.e. undocumented) property one can pass to the
> jvm.
> > As for the
> > timestamps themselves, they're generated by log4j
> within
> > the
> > appropriate logging method (.info(), warn(), ...).
> The
> > timezone
> > selected is that of the host jvm within which the app
> is
> > running.
> > 
> > On Thu, Apr 22, 2010 at 1:39 AM,  <st...@yahoo.com>
> > wrote:
> > > Cool thx!
> > >
> > > I already moved ntp to NN (had the same thought).
> All
> > machines sync to the same apt-mirror, but I'll check
> the TZ
> > files to make sure. Not much left to try...
> > > I even formatted NN, and it didn't go away.
> > >
> > > Next would be to dig up hadoop source for
> timestamps,
> > and run it in a harness, I guess. Could be java
> > versioning/TZ issue, I suppose (Is java tzdata
> managed
> > separately?)
> > >
> > > We'll see
> > >
> > > Best,
> > >  -stu
> > > Sent from my Verizon Wireless BlackBerry
> > >
> > > -----Original Message-----
> > > From: Eric Sammer <es...@cloudera.com>
> > > Date: Thu, 22 Apr 2010 01:17:25
> > > To: <hd...@hadoop.apache.org>
> > > Subject: Re: hadoop log timestamps & file
> > timestamps not same as system time
> > >
> > > My initial inclination is to check the timezone
> of the
> > machines. An
> > > hour difference smells like DST shift. Check that
> the
> > OS are the same
> > > version / patch level (as zoneinfo files may
> have
> > changed based on
> > > legislation recently passed). Confirm that all
> > machines are living in
> > > the same zone. Sync'ing to ntp is good too
> although an
> > hour outage
> > > should be fine. If you want to protect against
> ntp
> > drift due to
> > > outage, you can run a canonical ntp server inside
> the
> > cluster. Given
> > > the NN has to be available for the cluster to
> > function, that might be
> > > a good place to run one and just have the DN /
> TT
> > nodes sync to it.
> > >
> > > HTH.
> > >
> > > On Wed, Apr 21, 2010 at 6:46 PM, Stuart Smith
> <st...@yahoo.com>
> > wrote:
> > >> Hello,
> > >>  I've noticed an odd situation:
> > >>
> > >> The timestamps in my hadoop namenode log
> (and
> > filesystem) appear to be an hour behind the actual
> time on
> > the system (as supplied by `date`).
> > >>
> > >> I checked that this wasn't user error by
> stopping
> > and starting hadoop, and checking the most recent
> messages
> > (as well as the last time the file was modified).
> > >>
> > >> For example, you can see the last modified
> time,
> > and the timestamp on the shutdown log message
> different
> > significantly:
> > >>
> > >> stu@ubuntu-namenode:~/hadoop/logs$ ls -l
> > hadoop-stu-namenode-ubuntu-namenode.log
> > >> -rw-r--r-- 1 stu stu 22148124 2010-04-21
> 15:33
> > hadoop-stu-namenode-ubuntu-namenode.log
> > >> stu@ubuntu-namenode:~/hadoop/logs$ tail -f
> > hadoop-stu-namenode-ubuntu-namenode.log
> > >> ....
> > >> 2010-04-21 14:33:19,032 INFO
> > org.apache.hadoop.hdfs.server.namenode.NameNode:
> > SHUTDOWN_MSG:
> > >>
> >
> /************************************************************
> > >> SHUTDOWN_MSG: Shutting down NameNode at
> > ubuntu-namenode/192.168.193.87
> > >>
> >
> ************************************************************/
> > >>
> > >> I also checked this while hadoop was running:
> the
> > messages are consistently an hour behind.
> > >>
> > >> More troubling, this appears affect the
> filesystem
> > timestamps as well.
> > >>
> > >> stu@ubuntu-namenode:~/hadoop/bin$ date
> > >> Wed Apr 21 15:53:34 PDT 2010
> > >> stu@ubuntu-namenode:~/hadoop/bin$ ./hadoop
> dfs
> > -touchz /test
> > >> stu@ubuntu-namenode:~/hadoop/bin$ ./hadoop
> dfs -ls
> > /
> > >> Found 2 items
> > >> drwxr-xr-x   - stu supergroup        
>  0
> > 2010-04-20 16:51 /home
> > >> -rw-r--r--   3 stu supergroup        
>  0
> > 2010-04-21 14:53 /test
> > >>
> > >>
> > >> Interestingly, the datanodes appear to be
> fine:
> > >>
> > >> stu@ubuntu-hadoop-2:~/hadoop/logs$ ls -l
> > hadoop-stu-datanode-ubuntu-hadoop-2.log
> > >> -rw-r--r-- 1 stu stu 4918505 2010-04-21
> 15:47
> > hadoop-stu-datanode-ubuntu-hadoop-2.log
> > >> stu@ubuntu-hadoop-2:~/hadoop/logs$ tail -n 1
> > hadoop-stu-datanode-ubuntu-hadoop-2.log
> > >> 2010-04-21 15:47:21,875 INFO
> org.mortbay.log:
> > jetty-6.1.14
> > >>
> > >> Is there something wrong here? Am I just
> doing
> > something crazy/dumb?
> > >>
> > >> Note that I discovered this after getting
> some
> > rather odd behavior from hbase - hbase appears to work
> fine,
> > except it's continually adding the regionservers onto
> the
> > list so, in my itty-bitty cluster of about 4 machines
> + 1
> > namenode, the master reports back ~ 1000s of
> regionservers
> > and growing.. There are indications timing issues
> involved
> > in that as well, but I figure I should get hdfs
> working
> > first :)
> > >>
> > >> Note that I do have NTP running to keep all
> these
> > boxes in sync, but the NTP server will go down for an
> hour
> > or so intermittently (due to some network issues that
> don't
> > involve the hadoop cluster). All the system times on
> the
> > machines appear to be (at least) within a few seconds
> of
> > each other.
> > >>
> > >> I'm running hadoop 0.20.2
> > >>
> > >> Any thoughts? Any debugging guidance? I'm
> pretty
> > stumped.
> > >>
> > >> Take care,
> > >>  -stu
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > > --
> > > Eric Sammer
> > > phone: +1-917-287-2675
> > > twitter: esammer
> > > data: www.cloudera.com
> > >
> > 
> > 
> > 
> > -- 
> > Eric Sammer
> > phone: +1-917-287-2675
> > twitter: esammer
> > data: www.cloudera.com
> > 
> 
> 
>   
>