You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cloudstack.apache.org by Funs Kessen <FK...@schubergphilis.com> on 2014/07/10 10:40:04 UTC

"Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Hi Devs,

I recently noticed on master that after starting, stopping and starting the management server again I get stacktraces about every second.
After some digging with Daan we found that part of the a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The problem is however deeper rooted, as the code is supposed to update the mshost table, but because another Session ID (actually _runId from ClusterManagerImpl.java from the previous run)  is already in there the update fails. The Session ID is based on time in milliseconds, and offcourse changes when you stop and start the management server again.
Prior to the commit it failed silently, and has done so since the initial checkin it seems. The real question is what the original idea is behind the _runId, and is it something that requires fixing ?

I've pasted part of the commit, sql and stacktrace below.

Cheers,

Funs

===
mysql> select * from mshost;
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
| id | msid           | runid         | name    | state | version        | service_ip | service_port | last_update         | removed | alert_count |
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
|  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
1 row in set (0.00 sec)

===
commit a600d8408ea86782318139c17cf346c8497943d0
Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02 10:38:16
Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-04 12:47:58

Fixed Resource Leaks, null dereferences, few other issues reported by coverity

- framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java 
diff --git a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
index 3d0c3f5..89d7d27 100644
--- a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
+++ b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
@@ -53,15 +53,14 @@
     @Override
     public void invalidateRunSession(long id, long runid) {
         TransactionLegacy txn = TransactionLegacy.currentTxn();
-        PreparedStatement pstmt = null;
-        try {
-            pstmt = txn.prepareAutoCloseStatement("update mshost set runid=0, state='Down' where id=? and runid=?");
-            pstmt.setLong(1, id);
-            pstmt.setLong(2, runid);
-
-            pstmt.executeUpdate();
+        try (PreparedStatement pstmt = txn.prepareStatement("update mshost set runid=0, state='Down' where id=? and runid=?");){
+            if(pstmt != null) {
+                pstmt.setLong(1, id);
+                pstmt.setLong(2, runid);
+                pstmt.executeUpdate();
+            }
         } catch (SQLException e) {
-            throw new CloudRuntimeException("DB exception on " + pstmt.toString(), e);
+            throw new CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(), e);
         }
     }

===
2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster session detected
com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy158.update(Unknown Source)
        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more
2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat
java.lang.RuntimeException: update:Exception:Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy158.update(Unknown Source)
        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more
Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more




RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Funs Kessen <FK...@schubergphilis.com>.
Hi Santhosh,

Yep looks good!

In the meantime I've stumbled across another one that seems to originate either from the NetworksDaoImp.java upon update or allocateVnet, still looking into it, but here's the message:
2014-07-11 11:58:01,646 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-4:ctx-3e9098c6 job-67/job-68 ctx-afb3767f) Lock is acquired for network id 205 as a part of network implement
2014-07-11 11:58:01,647 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-4:ctx-3e9098c6 job-67/job-68 ctx-afb3767f) Asking ExternalGuestNetworkGuru to implement Ntwk[205|Guest|8]
2014-07-11 11:58:01,673 DEBUG [c.c.u.d.T.Transaction] (Work-Job-Executor-4:ctx-3e9098c6 job-67/job-68 ctx-afb3767f) Rolling back the transaction: Time = 1 Name =  Work-Job-Executor-4; called by -TransactionLegacy.rollback:902-Transac
tionLegacy.removeUpTo:845-TransactionLegacy.close:669-TransactionContextInterceptor.invoke:36-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocation.proceed:172-JdkDynamicAopProxy.invo
ke:204-$Proxy45.take:-1-DataCenterDaoImpl.allocateVnet:215-NativeMethodAccessorImpl.invoke0:-2-NativeMethodAccessorImpl.invoke:57
2014-07-11 11:58:01,675 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-4:ctx-3e9098c6 job-67/job-68 ctx-afb3767f) Cleaning up because we're unable to implement the network Ntwk[205|Guest|8]
2014-07-11 11:58:01,689 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-4:ctx-3e9098c6 job-67/job-68 ctx-afb3767f) Lock is acquired for network Ntwk[205|Guest|8] as a part of network shutdown

Cheers,

Funs

> -----Original Message-----
> From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> Sent: Thursday, July 10, 2014 8:13 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> I just reverted the changes done to this specific file, then
> 
> built the management server, stopped and started again twice, issue is not
> reproducible deployed a data center post the revert on simulator, it worked.
> 
> Please check. It seems,  TransactionLegacy now implements Closeable, new
> change under master, will check the reported issues for coverity to this file
> separately post the analysis.
> 
> Regards,
> Santhosh
> ________________________________________
> From: Funs Kessen [FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 12:18 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Hi Likitha,
> 
> You are absolutely spot on! This is the error message I found:
> INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server
> 90520734207775 is being started WARN  [c.c.u.d.T.Transaction] (main:null)
> txn: Commit called when it is not a transaction: -Transaction.execute:38-
> Transaction.execute:46-ClusterManagerImpl.start:940-
> CloudStackExtendedLifeCycle$1.with:75-
> CloudStackExtendedLifeCycle.with:153-
> CloudStackExtendedLifeCycle.startBeans:72-
> CloudStackExtendedLifeCycleStart.run:46-
> DefaultModuleDefinitionSet$1.with:105-
> DefaultModuleDefinitionSet.withModule:245-
> DefaultModuleDefinitionSet.withModule:250-
> DefaultModuleDefinitionSet.withModule:250-
> DefaultModuleDefinitionSet.withModule:233
> INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server (host id : 1)
> is being started at 127.0.0.1:9090
> 
> Cheers,
> 
> Funs
> 
> > -----Original Message-----
> > From: Likitha Shetty [mailto:likitha.shetty@citrix.com]
> > Sent: Thursday, July 10, 2014 5:35 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Are you guys are referring to the update method that is throwing the
> > error, 'update(long id, long runid, Date lastUpdate) ?. Agreed that
> > this is just a symptom.
> >
> > But I was referring to the first update method in the file which is
> > 'update(long id, long runid, String name, String version, String
> > serviceIP, int servicePort, Date lastUpdate)'.
> > From what I understand this is the method that updates the MS runId
> > when it is started. And the problem is that the transaction commit
> > made in this update is getting rolled back, which is why we have a
> > wrong runId in the DB for the MS. Since the second update method is
> > based on the right runId it throws an exception.
> > And I suspect the original transaction is getting rolled back because
> > we are trying to close a transaction that is already closed.
> >
> > Thanks,
> > Likitha
> >
> > -----Original Message-----
> > From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> > Sent: Thursday, July 10, 2014 5:10 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Hi Santhosh,
> >
> > You're right indeed, it's the update statement in the update itself
> > that adds the runid in the where clause, it came in three years and
> > three months ago by the looks of it.
> >
> > Cheers,
> >
> > Funs
> >
> > > -----Original Message-----
> > > From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> > > Sent: Thursday, July 10, 2014 1:22 PM
> > > To: dev@cloudstack.apache.org
> > > Subject: RE: "Bug" introduced by part of commit
> > > a600d8408ea86782318139c17cf346c8497943d0
> > >
> > > The mentioned issue by Funs was existing earlier as well but was
> > > hidden by just logging it when update was failed, now with new
> > > change, we resurfaced the problem by throwing it. Its not because of
> > > close though. We can see to fix the original problem.
> > >
> > > Santhosh
> > > ________________________________________
> > > From: Likitha Shetty [likitha.shetty@citrix.com]
> > > Sent: Thursday, July 10, 2014 7:18 AM
> > > To: dev@cloudstack.apache.org
> > > Subject: RE: "Bug" introduced by part of commit
> > > a600d8408ea86782318139c17cf346c8497943d0
> > >
> > > Santhosh, I pulled the latest code and unfortunately the issue still exists.
> > >
> > > Looks like the problem is because we try to close the transaction
> > > after it has been committed (which implicitly closes the
> > > transaction)
> > > -
> >
> /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> > > Impl.java
> > >             txn.commit();
> > >         } catch (RuntimeException e) {
> > >             txn.rollback();
> > >             s_logger.warn("Unexpected exception, ", e);
> > >             throw new RuntimeException(e.getMessage(), e);
> > >         }finally {
> > >             txn.close();
> > >         }
> > >
> > > Thanks,
> > > Likitha
> > >
> > > -----Original Message-----
> > > From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> > > Sent: Thursday, July 10, 2014 2:38 PM
> > > To: dev@cloudstack.apache.org
> > > Subject: RE: "Bug" introduced by part of commit
> > > a600d8408ea86782318139c17cf346c8497943d0
> > >
> > > Yeah I see that Ipasted the wrong bit of code *doh*, it was update
> > > that was the problem from the same file:
> > >
> > > @@ -129,24 +133,29 @@
> > >      @DB
> > >      public void update(long id, long runid, Date lastUpdate) {
> > >          TransactionLegacy txn = TransactionLegacy.currentTxn();
> > > -        PreparedStatement pstmt = null;
> > >          try {
> > >              txn.start();
> > > +            try( PreparedStatement pstmt =
> > > + txn.prepareStatement("update mshost
> > > set last_update=?, removed=null, alert_count=0 where id=? and
> > > runid=?");) {
> > > +                pstmt.setString(1,
> > > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"),
> > > lastUpdate));
> > > +                pstmt.setLong(2, id);
> > > +                pstmt.setLong(3, runid);
> > >
> > > -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > > last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> > > -            pstmt.setString(1,
> > > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"),
> lastUpdate));
> > > -            pstmt.setLong(2, id);
> > > -            pstmt.setLong(3, runid);
> > > +                int count = pstmt.executeUpdate();
> > > +                txn.commit();
> > >
> > > -            int count = pstmt.executeUpdate();
> > > -            txn.commit();
> > > -
> > > -            if (count < 1) {
> > > -                throw new CloudRuntimeException("Invalid cluster session
> > > detected", new ClusterInvalidSessionException("runid " + runid + "
> > > is no longer valid"));
> > > +                if (count < 1) {
> > > +                    throw new CloudRuntimeException("Invalid
> > > + cluster session
> > > detected", new ClusterInvalidSessionException("runid " + runid + "
> > > is no longer valid"));
> > > +                }
> > > +            }catch (SQLException e) {
> > > +                throw new
> > > + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
> > >              }
> > > -        } catch (Exception e) {
> > > -            s_logger.warn("Unexpected exception, ", e);
> > > -            throw new RuntimeException(e.getMessage(), e);
> > > +        } catch (RuntimeException e) {
> > > +            txn.rollback();
> > > +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> > > +            throw new
> > > + RuntimeException("update:Exception:"+e.getMessage(),
> > e);
> > > +        }
> > > +        finally {
> > > +            txn.close();
> > >          }
> > >      }
> > >
> > >
> > > > -----Original Message-----
> > > > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo
> > > > Trippaers
> > > > Sent: Thursday, July 10, 2014 11:00 AM
> > > > To: dev@cloudstack.apache.org
> > > > Subject: Re: "Bug" introduced by part of commit
> > > > a600d8408ea86782318139c17cf346c8497943d0
> > > >
> > > > Fixed already, good job Santosh
> > > >
> > > > Cheers,
> > > >
> > > > Hugo
> > > >
> > > > On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> > > >
> > > > > Doesn't seem related to that particular commit, the function
> > > > invalidateRunSession appears to be dead code.
> > > > >
> > > > > I'll do some digging.
> > > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Hugo
> > > > >
> > > > > On 10 jul. 2014, at 10:40, Funs Kessen
> > > > > <FK...@schubergphilis.com>
> > > > wrote:
> > > > >
> > > > >> Hi Devs,
> > > > >>
> > > > >> I recently noticed on master that after starting, stopping and
> > > > >> starting the
> > > > management server again I get stacktraces about every second.
> > > > >> After some digging with Daan we found that part of the
> > > > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> > > The
> > > > problem is however deeper rooted, as the code is supposed to
> > > > update the mshost table, but because another Session ID (actually
> > > > _runId from ClusterManagerImpl.java from the previous run)  is
> > > > already in there the update fails. The Session ID is based on time
> > > > in milliseconds, and offcourse changes when you stop and start the
> > management server again.
> > > > >> Prior to the commit it failed silently, and has done so since
> > > > >> the initial
> > > > checkin it seems. The real question is what the original idea is
> > > > behind the _runId, and is it something that requires fixing ?
> > > > >>
> > > > >> I've pasted part of the commit, sql and stacktrace below.
> > > > >>
> > > > >> Cheers,
> > > > >>
> > > > >> Funs
> > > > >>
> > > > >> ===
> > > > >> mysql> select * from mshost;
> > > > >> +----+----------------+---------------+---------+-------+----------------+------------+---
> ---
> > --
> > > --
> > > > ----+---------------------+---------+-------------+
> > > > >> | id | msid           | runid         | name    | state | version        |
> service_ip
> > |
> > > > service_port | last_update         | removed | alert_count |
> > > > >> +----+----------------+---------------+---------+-------+----------------+------------+---
> ---
> > --
> > > --
> > > > ----+---------------------+---------+-------------+
> > > > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > > > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |
> > 0
> > > |
> > > > >> +----+----------------+---------------+---------+-------+----------------+------------+---
> ---
> > --
> > > --
> > > > ----+---------------------+---------+-------------+
> > > > >> 1 row in set (0.00 sec)
> > > > >>
> > > > >> ===
> > > > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > > > >> Author: Santhosh Edukulla <sa...@gmail.com>
> > > > >> 2014-07-
> > > 02
> > > > >> 10:38:16
> > > > >> Committer: Santhosh Edukulla <sa...@gmail.com>
> > > > >> 2014-
> > > 07-
> > > > 04
> > > > >> 12:47:58
> > > > >>
> > > > >> Fixed Resource Leaks, null dereferences, few other issues
> > > > >> reported by coverity
> > > > >>
> > > > >> -
> > > > >>
> > > >
> > >
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > > > m
> > > > >> pl.java diff --git
> > > > >>
> > > >
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > > o
> > > > >> Impl.java
> > > > >>
> > > >
> > >
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > > o
> > > > >> Impl.java
> > > > >> index 3d0c3f5..89d7d27 100644
> > > > >> ---
> > > > >>
> > > >
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > > o
> > > > >> Impl.java
> > > > >> +++
> > > >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > > > >> +++ tDaoImpl.java
> > > > >> @@ -53,15 +53,14 @@
> > > > >>    @Override
> > > > >>    public void invalidateRunSession(long id, long runid) {
> > > > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > > > >> -        PreparedStatement pstmt = null;
> > > > >> -        try {
> > > > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > > > runid=0, state='Down' where id=? and runid=?");
> > > > >> -            pstmt.setLong(1, id);
> > > > >> -            pstmt.setLong(2, runid);
> > > > >> -
> > > > >> -            pstmt.executeUpdate();
> > > > >> +        try (PreparedStatement pstmt =
> > > > >> + txn.prepareStatement("update
> > > > mshost set runid=0, state='Down' where id=? and runid=?");){
> > > > >> +            if(pstmt != null) {
> > > > >> +                pstmt.setLong(1, id);
> > > > >> +                pstmt.setLong(2, runid);
> > > > >> +                pstmt.executeUpdate();
> > > > >> +            }
> > > > >>        } catch (SQLException e) {
> > > > >> -            throw new CloudRuntimeException("DB exception on " +
> > > > pstmt.toString(), e);
> > > > >> +            throw new
> > > > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > > > >> + e.getMessage(), e);
> > > > >>        }
> > > > >>    }
> > > > >>
> > > > >> ===
> > > > >> 2014-07-09 18:45:06,541 WARN
> > > [c.c.c.d.ManagementServerHostDaoImpl]
> > > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid
> > > > >> cluster session detected
> > > > >> com.cloud.utils.exception.CloudRuntimeException: Invalid
> > > > >> cluster session
> > > > detected
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > > ServerHostDaoImpl.java:147)
> > > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >>       at
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > > a
> > > > va:57)
> > > > >>       at
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > > orImpl.java:43)
> > > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >>       at
> > > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingRefle
> > > > ct
> > > > io
> > > > n(
> > > > AopUtils.java:317)
> > > > >>       at
> > > > org.springframework.aop.framework.ReflectiveMethodInvocation.invok
> > > > eJ
> > > > oi
> > > > n
> > > > point(ReflectiveMethodInvocation.java:183)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:150)
> > > > >>       at
> > > > com.cloud.utils.db.TransactionContextInterceptor.invoke(Transactio
> > > > nC
> > > > on
> > > > tex
> > > > tInterceptor.java:34)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:161)
> > > > >>       at
> > > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.in
> > > > vo
> > > > ke
> > > > (Ex
> > > > poseInvocationInterceptor.java:91)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:172)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > > icAopProxy.java:204)
> > > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > > .java:545)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > > anagedContextRunnable.java:49)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > > l
> > > > (DefaultManagedContext.java:56)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > > ithContext(DefaultManagedContext.java:103)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > > thContext(DefaultManagedContext.java:53)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > > agedContextRunnable.java:46)
> > > > >>       at
> > > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java
> > > > :4
> > > > 71
> > > > )
> > > > >>       at
> > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > > >>       at
> > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTa
> > > > sk
> > > > .a
> > > > c
> > > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > > >>       at
> > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTa
> > > > sk
> > > > .r
> > > > u
> > > > n(ScheduledThreadPoolExecutor.java:293)
> > > > >>       at
> > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecut
> > > > or
> > > > .j
> > > > av
> > > > a:1145)
> > > > >>       at
> > > >
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > > ja
> > > > va:615)
> > > > >>       at java.lang.Thread.run(Thread.java:744)
> > > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > > >> runid
> > > > 1404924278923 is no longer valid
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > > ServerHostDaoImpl.java:147)
> > > > >>       ... 26 more
> > > > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in
> > > > >> cluster heartbeat
> > > > >> java.lang.RuntimeException: update:Exception:Invalid cluster
> > > > >> session
> > > > detected
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > > ServerHostDaoImpl.java:155)
> > > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >>       at
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > > a
> > > > va:57)
> > > > >>       at
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > > orImpl.java:43)
> > > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >>       at
> > > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingRefle
> > > > ct
> > > > io
> > > > n(
> > > > AopUtils.java:317)
> > > > >>       at
> > > > org.springframework.aop.framework.ReflectiveMethodInvocation.invok
> > > > eJ
> > > > oi
> > > > n
> > > > point(ReflectiveMethodInvocation.java:183)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:150)
> > > > >>       at
> > > > com.cloud.utils.db.TransactionContextInterceptor.invoke(Transactio
> > > > nC
> > > > on
> > > > tex
> > > > tInterceptor.java:34)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:161)
> > > > >>       at
> > > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.in
> > > > vo
> > > > ke
> > > > (Ex
> > > > poseInvocationInterceptor.java:91)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > > eflectiveMethodInvocation.java:172)
> > > > >>       at
> > > >
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > > icAopProxy.java:204)
> > > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > > .java:545)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > > anagedContextRunnable.java:49)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > > l
> > > > (DefaultManagedContext.java:56)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > > ithContext(DefaultManagedContext.java:103)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > > thContext(DefaultManagedContext.java:53)
> > > > >>       at
> > > >
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > > agedContextRunnable.java:46)
> > > > >>       at
> > > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java
> > > > :4
> > > > 71
> > > > )
> > > > >>       at
> > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > > >>       at
> > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTa
> > > > sk
> > > > .a
> > > > c
> > > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > > >>       at
> > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTa
> > > > sk
> > > > .r
> > > > u
> > > > n(ScheduledThreadPoolExecutor.java:293)
> > > > >>       at
> > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecut
> > > > or
> > > > .j
> > > > av
> > > > a:1145)
> > > > >>       at
> > > >
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > > ja
> > > > va:615)
> > > > >>       at java.lang.Thread.run(Thread.java:744)
> > > > >> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> > > > >> Invalid
> > > > cluster session detected
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > > ServerHostDaoImpl.java:147)
> > > > >>       ... 26 more
> > > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > > >> runid
> > > > 1404924278923 is no longer valid
> > > > >>       at
> > > >
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > > ServerHostDaoImpl.java:147)
> > > > >>       ... 26 more
> > > > >>
> > > > >>
> > > > >>
> > > > >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Santhosh Edukulla <sa...@citrix.com>.
I just reverted the changes done to this specific file, then

built the management server, stopped and started again twice, issue is not reproducible
deployed a data center post the revert on simulator, it worked.

Please check. It seems,  TransactionLegacy now implements Closeable, new change under master, will check the reported issues for coverity to this file separately post the analysis.

Regards,
Santhosh
________________________________________
From: Funs Kessen [FKessen@schubergphilis.com]
Sent: Thursday, July 10, 2014 12:18 PM
To: dev@cloudstack.apache.org
Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Hi Likitha,

You are absolutely spot on! This is the error message I found:
INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server 90520734207775 is being started
WARN  [c.c.u.d.T.Transaction] (main:null) txn: Commit called when it is not a transaction: -Transaction.execute:38-Transaction.execute:46-ClusterManagerImpl.start:940-CloudStackExtendedLifeCycle$1.with:75-CloudStackExtendedLifeCycle.with:153-CloudStackExtendedLifeCycle.startBeans:72-CloudStackExtendedLifeCycleStart.run:46-DefaultModuleDefinitionSet$1.with:105-DefaultModuleDefinitionSet.withModule:245-DefaultModuleDefinitionSet.withModule:250-DefaultModuleDefinitionSet.withModule:250-DefaultModuleDefinitionSet.withModule:233
INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server (host id : 1) is being started at 127.0.0.1:9090

Cheers,

Funs

> -----Original Message-----
> From: Likitha Shetty [mailto:likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 5:35 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
>
> Are you guys are referring to the update method that is throwing the error,
> 'update(long id, long runid, Date lastUpdate) ?. Agreed that this is just a
> symptom.
>
> But I was referring to the first update method in the file which is
> 'update(long id, long runid, String name, String version, String serviceIP, int
> servicePort, Date lastUpdate)'.
> From what I understand this is the method that updates the MS runId when it
> is started. And the problem is that the transaction commit made in this
> update is getting rolled back, which is why we have a wrong runId in the DB
> for the MS. Since the second update method is based on the right runId it
> throws an exception.
> And I suspect the original transaction is getting rolled back because we are
> trying to close a transaction that is already closed.
>
> Thanks,
> Likitha
>
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 5:10 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
>
> Hi Santhosh,
>
> You're right indeed, it's the update statement in the update itself that adds
> the runid in the where clause, it came in three years and three months ago by
> the looks of it.
>
> Cheers,
>
> Funs
>
> > -----Original Message-----
> > From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> > Sent: Thursday, July 10, 2014 1:22 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > The mentioned issue by Funs was existing earlier as well but was
> > hidden by just logging it when update was failed, now with new change,
> > we resurfaced the problem by throwing it. Its not because of close
> > though. We can see to fix the original problem.
> >
> > Santhosh
> > ________________________________________
> > From: Likitha Shetty [likitha.shetty@citrix.com]
> > Sent: Thursday, July 10, 2014 7:18 AM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Santhosh, I pulled the latest code and unfortunately the issue still exists.
> >
> > Looks like the problem is because we try to close the transaction
> > after it has been committed (which implicitly closes the transaction)
> > -
> /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> > Impl.java
> >             txn.commit();
> >         } catch (RuntimeException e) {
> >             txn.rollback();
> >             s_logger.warn("Unexpected exception, ", e);
> >             throw new RuntimeException(e.getMessage(), e);
> >         }finally {
> >             txn.close();
> >         }
> >
> > Thanks,
> > Likitha
> >
> > -----Original Message-----
> > From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> > Sent: Thursday, July 10, 2014 2:38 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Yeah I see that Ipasted the wrong bit of code *doh*, it was update
> > that was the problem from the same file:
> >
> > @@ -129,24 +133,29 @@
> >      @DB
> >      public void update(long id, long runid, Date lastUpdate) {
> >          TransactionLegacy txn = TransactionLegacy.currentTxn();
> > -        PreparedStatement pstmt = null;
> >          try {
> >              txn.start();
> > +            try( PreparedStatement pstmt =
> > + txn.prepareStatement("update mshost
> > set last_update=?, removed=null, alert_count=0 where id=? and
> > runid=?");) {
> > +                pstmt.setString(1,
> > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"),
> > lastUpdate));
> > +                pstmt.setLong(2, id);
> > +                pstmt.setLong(3, runid);
> >
> > -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> > -            pstmt.setString(1,
> > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> > -            pstmt.setLong(2, id);
> > -            pstmt.setLong(3, runid);
> > +                int count = pstmt.executeUpdate();
> > +                txn.commit();
> >
> > -            int count = pstmt.executeUpdate();
> > -            txn.commit();
> > -
> > -            if (count < 1) {
> > -                throw new CloudRuntimeException("Invalid cluster session
> > detected", new ClusterInvalidSessionException("runid " + runid + " is
> > no longer valid"));
> > +                if (count < 1) {
> > +                    throw new CloudRuntimeException("Invalid cluster
> > + session
> > detected", new ClusterInvalidSessionException("runid " + runid + " is
> > no longer valid"));
> > +                }
> > +            }catch (SQLException e) {
> > +                throw new
> > + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
> >              }
> > -        } catch (Exception e) {
> > -            s_logger.warn("Unexpected exception, ", e);
> > -            throw new RuntimeException(e.getMessage(), e);
> > +        } catch (RuntimeException e) {
> > +            txn.rollback();
> > +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> > +            throw new RuntimeException("update:Exception:"+e.getMessage(),
> e);
> > +        }
> > +        finally {
> > +            txn.close();
> >          }
> >      }
> >
> >
> > > -----Original Message-----
> > > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> > > Sent: Thursday, July 10, 2014 11:00 AM
> > > To: dev@cloudstack.apache.org
> > > Subject: Re: "Bug" introduced by part of commit
> > > a600d8408ea86782318139c17cf346c8497943d0
> > >
> > > Fixed already, good job Santosh
> > >
> > > Cheers,
> > >
> > > Hugo
> > >
> > > On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> > >
> > > > Doesn't seem related to that particular commit, the function
> > > invalidateRunSession appears to be dead code.
> > > >
> > > > I'll do some digging.
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Hugo
> > > >
> > > > On 10 jul. 2014, at 10:40, Funs Kessen
> > > > <FK...@schubergphilis.com>
> > > wrote:
> > > >
> > > >> Hi Devs,
> > > >>
> > > >> I recently noticed on master that after starting, stopping and
> > > >> starting the
> > > management server again I get stacktraces about every second.
> > > >> After some digging with Daan we found that part of the
> > > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> > The
> > > problem is however deeper rooted, as the code is supposed to update
> > > the mshost table, but because another Session ID (actually _runId
> > > from ClusterManagerImpl.java from the previous run)  is already in
> > > there the update fails. The Session ID is based on time in
> > > milliseconds, and offcourse changes when you stop and start the
> management server again.
> > > >> Prior to the commit it failed silently, and has done so since the
> > > >> initial
> > > checkin it seems. The real question is what the original idea is
> > > behind the _runId, and is it something that requires fixing ?
> > > >>
> > > >> I've pasted part of the commit, sql and stacktrace below.
> > > >>
> > > >> Cheers,
> > > >>
> > > >> Funs
> > > >>
> > > >> ===
> > > >> mysql> select * from mshost;
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> | id | msid           | runid         | name    | state | version        | service_ip
> |
> > > service_port | last_update         | removed | alert_count |
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |
> 0
> > |
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> 1 row in set (0.00 sec)
> > > >>
> > > >> ===
> > > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > > >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> > 02
> > > >> 10:38:16
> > > >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-
> > 07-
> > > 04
> > > >> 12:47:58
> > > >>
> > > >> Fixed Resource Leaks, null dereferences, few other issues
> > > >> reported by coverity
> > > >>
> > > >> -
> > > >>
> > >
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > > m
> > > >> pl.java diff --git
> > > >>
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >>
> > >
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >> index 3d0c3f5..89d7d27 100644
> > > >> ---
> > > >>
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >> +++
> > > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > > >> +++ tDaoImpl.java
> > > >> @@ -53,15 +53,14 @@
> > > >>    @Override
> > > >>    public void invalidateRunSession(long id, long runid) {
> > > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > > >> -        PreparedStatement pstmt = null;
> > > >> -        try {
> > > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > > runid=0, state='Down' where id=? and runid=?");
> > > >> -            pstmt.setLong(1, id);
> > > >> -            pstmt.setLong(2, runid);
> > > >> -
> > > >> -            pstmt.executeUpdate();
> > > >> +        try (PreparedStatement pstmt =
> > > >> + txn.prepareStatement("update
> > > mshost set runid=0, state='Down' where id=? and runid=?");){
> > > >> +            if(pstmt != null) {
> > > >> +                pstmt.setLong(1, id);
> > > >> +                pstmt.setLong(2, runid);
> > > >> +                pstmt.executeUpdate();
> > > >> +            }
> > > >>        } catch (SQLException e) {
> > > >> -            throw new CloudRuntimeException("DB exception on " +
> > > pstmt.toString(), e);
> > > >> +            throw new
> > > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > > >> + e.getMessage(), e);
> > > >>        }
> > > >>    }
> > > >>
> > > >> ===
> > > >> 2014-07-09 18:45:06,541 WARN
> > [c.c.c.d.ManagementServerHostDaoImpl]
> > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid
> > > >> cluster session detected
> > > >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster
> > > >> session
> > > detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >>       at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > a
> > > va:57)
> > > >>       at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > orImpl.java:43)
> > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>       at
> > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > > io
> > > n(
> > > AopUtils.java:317)
> > > >>       at
> > > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > > oi
> > > n
> > > point(ReflectiveMethodInvocation.java:183)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:150)
> > > >>       at
> > > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > > on
> > > tex
> > > tInterceptor.java:34)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:161)
> > > >>       at
> > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > > ke
> > > (Ex
> > > poseInvocationInterceptor.java:91)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:172)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > icAopProxy.java:204)
> > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > >>       at
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > .java:545)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > anagedContextRunnable.java:49)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > l
> > > (DefaultManagedContext.java:56)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > ithContext(DefaultManagedContext.java:103)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > thContext(DefaultManagedContext.java:53)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > agedContextRunnable.java:46)
> > > >>       at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > > 71
> > > )
> > > >>       at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .a
> > > c
> > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .r
> > > u
> > > n(ScheduledThreadPoolExecutor.java:293)
> > > >>       at
> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > > .j
> > > av
> > > a:1145)
> > > >>       at
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > ja
> > > va:615)
> > > >>       at java.lang.Thread.run(Thread.java:744)
> > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > >> runid
> > > 1404924278923 is no longer valid
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in
> > > >> cluster heartbeat
> > > >> java.lang.RuntimeException: update:Exception:Invalid cluster
> > > >> session
> > > detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:155)
> > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >>       at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > a
> > > va:57)
> > > >>       at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > orImpl.java:43)
> > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>       at
> > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > > io
> > > n(
> > > AopUtils.java:317)
> > > >>       at
> > > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > > oi
> > > n
> > > point(ReflectiveMethodInvocation.java:183)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:150)
> > > >>       at
> > > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > > on
> > > tex
> > > tInterceptor.java:34)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:161)
> > > >>       at
> > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > > ke
> > > (Ex
> > > poseInvocationInterceptor.java:91)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:172)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > icAopProxy.java:204)
> > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > >>       at
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > .java:545)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > anagedContextRunnable.java:49)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > l
> > > (DefaultManagedContext.java:56)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > ithContext(DefaultManagedContext.java:103)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > thContext(DefaultManagedContext.java:53)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > agedContextRunnable.java:46)
> > > >>       at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > > 71
> > > )
> > > >>       at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .a
> > > c
> > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .r
> > > u
> > > n(ScheduledThreadPoolExecutor.java:293)
> > > >>       at
> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > > .j
> > > av
> > > a:1145)
> > > >>       at
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > ja
> > > va:615)
> > > >>       at java.lang.Thread.run(Thread.java:744)
> > > >> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> > > >> Invalid
> > > cluster session detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > >> runid
> > > 1404924278923 is no longer valid
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >>
> > > >>
> > > >>
> > > >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Funs Kessen <FK...@schubergphilis.com>.
Hi Likitha,

You are absolutely spot on! This is the error message I found:
INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server 90520734207775 is being started
WARN  [c.c.u.d.T.Transaction] (main:null) txn: Commit called when it is not a transaction: -Transaction.execute:38-Transaction.execute:46-ClusterManagerImpl.start:940-CloudStackExtendedLifeCycle$1.with:75-CloudStackExtendedLifeCycle.with:153-CloudStackExtendedLifeCycle.startBeans:72-CloudStackExtendedLifeCycleStart.run:46-DefaultModuleDefinitionSet$1.with:105-DefaultModuleDefinitionSet.withModule:245-DefaultModuleDefinitionSet.withModule:250-DefaultModuleDefinitionSet.withModule:250-DefaultModuleDefinitionSet.withModule:233
INFO  [c.c.c.ClusterManagerImpl] (main:null) Management server (host id : 1) is being started at 127.0.0.1:9090

Cheers,

Funs

> -----Original Message-----
> From: Likitha Shetty [mailto:likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 5:35 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Are you guys are referring to the update method that is throwing the error,
> 'update(long id, long runid, Date lastUpdate) ?. Agreed that this is just a
> symptom.
> 
> But I was referring to the first update method in the file which is
> 'update(long id, long runid, String name, String version, String serviceIP, int
> servicePort, Date lastUpdate)'.
> From what I understand this is the method that updates the MS runId when it
> is started. And the problem is that the transaction commit made in this
> update is getting rolled back, which is why we have a wrong runId in the DB
> for the MS. Since the second update method is based on the right runId it
> throws an exception.
> And I suspect the original transaction is getting rolled back because we are
> trying to close a transaction that is already closed.
> 
> Thanks,
> Likitha
> 
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 5:10 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Hi Santhosh,
> 
> You're right indeed, it's the update statement in the update itself that adds
> the runid in the where clause, it came in three years and three months ago by
> the looks of it.
> 
> Cheers,
> 
> Funs
> 
> > -----Original Message-----
> > From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> > Sent: Thursday, July 10, 2014 1:22 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > The mentioned issue by Funs was existing earlier as well but was
> > hidden by just logging it when update was failed, now with new change,
> > we resurfaced the problem by throwing it. Its not because of close
> > though. We can see to fix the original problem.
> >
> > Santhosh
> > ________________________________________
> > From: Likitha Shetty [likitha.shetty@citrix.com]
> > Sent: Thursday, July 10, 2014 7:18 AM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Santhosh, I pulled the latest code and unfortunately the issue still exists.
> >
> > Looks like the problem is because we try to close the transaction
> > after it has been committed (which implicitly closes the transaction)
> > -
> /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> > Impl.java
> >             txn.commit();
> >         } catch (RuntimeException e) {
> >             txn.rollback();
> >             s_logger.warn("Unexpected exception, ", e);
> >             throw new RuntimeException(e.getMessage(), e);
> >         }finally {
> >             txn.close();
> >         }
> >
> > Thanks,
> > Likitha
> >
> > -----Original Message-----
> > From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> > Sent: Thursday, July 10, 2014 2:38 PM
> > To: dev@cloudstack.apache.org
> > Subject: RE: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Yeah I see that Ipasted the wrong bit of code *doh*, it was update
> > that was the problem from the same file:
> >
> > @@ -129,24 +133,29 @@
> >      @DB
> >      public void update(long id, long runid, Date lastUpdate) {
> >          TransactionLegacy txn = TransactionLegacy.currentTxn();
> > -        PreparedStatement pstmt = null;
> >          try {
> >              txn.start();
> > +            try( PreparedStatement pstmt =
> > + txn.prepareStatement("update mshost
> > set last_update=?, removed=null, alert_count=0 where id=? and
> > runid=?");) {
> > +                pstmt.setString(1,
> > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"),
> > lastUpdate));
> > +                pstmt.setLong(2, id);
> > +                pstmt.setLong(3, runid);
> >
> > -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> > -            pstmt.setString(1,
> > DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> > -            pstmt.setLong(2, id);
> > -            pstmt.setLong(3, runid);
> > +                int count = pstmt.executeUpdate();
> > +                txn.commit();
> >
> > -            int count = pstmt.executeUpdate();
> > -            txn.commit();
> > -
> > -            if (count < 1) {
> > -                throw new CloudRuntimeException("Invalid cluster session
> > detected", new ClusterInvalidSessionException("runid " + runid + " is
> > no longer valid"));
> > +                if (count < 1) {
> > +                    throw new CloudRuntimeException("Invalid cluster
> > + session
> > detected", new ClusterInvalidSessionException("runid " + runid + " is
> > no longer valid"));
> > +                }
> > +            }catch (SQLException e) {
> > +                throw new
> > + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
> >              }
> > -        } catch (Exception e) {
> > -            s_logger.warn("Unexpected exception, ", e);
> > -            throw new RuntimeException(e.getMessage(), e);
> > +        } catch (RuntimeException e) {
> > +            txn.rollback();
> > +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> > +            throw new RuntimeException("update:Exception:"+e.getMessage(),
> e);
> > +        }
> > +        finally {
> > +            txn.close();
> >          }
> >      }
> >
> >
> > > -----Original Message-----
> > > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> > > Sent: Thursday, July 10, 2014 11:00 AM
> > > To: dev@cloudstack.apache.org
> > > Subject: Re: "Bug" introduced by part of commit
> > > a600d8408ea86782318139c17cf346c8497943d0
> > >
> > > Fixed already, good job Santosh
> > >
> > > Cheers,
> > >
> > > Hugo
> > >
> > > On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> > >
> > > > Doesn't seem related to that particular commit, the function
> > > invalidateRunSession appears to be dead code.
> > > >
> > > > I'll do some digging.
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Hugo
> > > >
> > > > On 10 jul. 2014, at 10:40, Funs Kessen
> > > > <FK...@schubergphilis.com>
> > > wrote:
> > > >
> > > >> Hi Devs,
> > > >>
> > > >> I recently noticed on master that after starting, stopping and
> > > >> starting the
> > > management server again I get stacktraces about every second.
> > > >> After some digging with Daan we found that part of the
> > > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> > The
> > > problem is however deeper rooted, as the code is supposed to update
> > > the mshost table, but because another Session ID (actually _runId
> > > from ClusterManagerImpl.java from the previous run)  is already in
> > > there the update fails. The Session ID is based on time in
> > > milliseconds, and offcourse changes when you stop and start the
> management server again.
> > > >> Prior to the commit it failed silently, and has done so since the
> > > >> initial
> > > checkin it seems. The real question is what the original idea is
> > > behind the _runId, and is it something that requires fixing ?
> > > >>
> > > >> I've pasted part of the commit, sql and stacktrace below.
> > > >>
> > > >> Cheers,
> > > >>
> > > >> Funs
> > > >>
> > > >> ===
> > > >> mysql> select * from mshost;
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> | id | msid           | runid         | name    | state | version        | service_ip
> |
> > > service_port | last_update         | removed | alert_count |
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |
> 0
> > |
> > > >> +----+----------------+---------------+---------+-------+----------------+------------+------
> --
> > --
> > > ----+---------------------+---------+-------------+
> > > >> 1 row in set (0.00 sec)
> > > >>
> > > >> ===
> > > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > > >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> > 02
> > > >> 10:38:16
> > > >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-
> > 07-
> > > 04
> > > >> 12:47:58
> > > >>
> > > >> Fixed Resource Leaks, null dereferences, few other issues
> > > >> reported by coverity
> > > >>
> > > >> -
> > > >>
> > >
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > > m
> > > >> pl.java diff --git
> > > >>
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >>
> > >
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >> index 3d0c3f5..89d7d27 100644
> > > >> ---
> > > >>
> > >
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > > o
> > > >> Impl.java
> > > >> +++
> > > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > > >> +++ tDaoImpl.java
> > > >> @@ -53,15 +53,14 @@
> > > >>    @Override
> > > >>    public void invalidateRunSession(long id, long runid) {
> > > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > > >> -        PreparedStatement pstmt = null;
> > > >> -        try {
> > > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > > runid=0, state='Down' where id=? and runid=?");
> > > >> -            pstmt.setLong(1, id);
> > > >> -            pstmt.setLong(2, runid);
> > > >> -
> > > >> -            pstmt.executeUpdate();
> > > >> +        try (PreparedStatement pstmt =
> > > >> + txn.prepareStatement("update
> > > mshost set runid=0, state='Down' where id=? and runid=?");){
> > > >> +            if(pstmt != null) {
> > > >> +                pstmt.setLong(1, id);
> > > >> +                pstmt.setLong(2, runid);
> > > >> +                pstmt.executeUpdate();
> > > >> +            }
> > > >>        } catch (SQLException e) {
> > > >> -            throw new CloudRuntimeException("DB exception on " +
> > > pstmt.toString(), e);
> > > >> +            throw new
> > > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > > >> + e.getMessage(), e);
> > > >>        }
> > > >>    }
> > > >>
> > > >> ===
> > > >> 2014-07-09 18:45:06,541 WARN
> > [c.c.c.d.ManagementServerHostDaoImpl]
> > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid
> > > >> cluster session detected
> > > >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster
> > > >> session
> > > detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >>       at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > a
> > > va:57)
> > > >>       at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > orImpl.java:43)
> > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>       at
> > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > > io
> > > n(
> > > AopUtils.java:317)
> > > >>       at
> > > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > > oi
> > > n
> > > point(ReflectiveMethodInvocation.java:183)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:150)
> > > >>       at
> > > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > > on
> > > tex
> > > tInterceptor.java:34)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:161)
> > > >>       at
> > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > > ke
> > > (Ex
> > > poseInvocationInterceptor.java:91)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:172)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > icAopProxy.java:204)
> > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > >>       at
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > .java:545)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > anagedContextRunnable.java:49)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > l
> > > (DefaultManagedContext.java:56)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > ithContext(DefaultManagedContext.java:103)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > thContext(DefaultManagedContext.java:53)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > agedContextRunnable.java:46)
> > > >>       at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > > 71
> > > )
> > > >>       at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .a
> > > c
> > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .r
> > > u
> > > n(ScheduledThreadPoolExecutor.java:293)
> > > >>       at
> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > > .j
> > > av
> > > a:1145)
> > > >>       at
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > ja
> > > va:615)
> > > >>       at java.lang.Thread.run(Thread.java:744)
> > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > >> runid
> > > 1404924278923 is no longer valid
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in
> > > >> cluster heartbeat
> > > >> java.lang.RuntimeException: update:Exception:Invalid cluster
> > > >> session
> > > detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:155)
> > > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >>       at
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > a
> > > va:57)
> > > >>       at
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > orImpl.java:43)
> > > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>       at
> > > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > > io
> > > n(
> > > AopUtils.java:317)
> > > >>       at
> > > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > > oi
> > > n
> > > point(ReflectiveMethodInvocation.java:183)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:150)
> > > >>       at
> > > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > > on
> > > tex
> > > tInterceptor.java:34)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:161)
> > > >>       at
> > > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > > ke
> > > (Ex
> > > poseInvocationInterceptor.java:91)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > > eflectiveMethodInvocation.java:172)
> > > >>       at
> > >
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > > icAopProxy.java:204)
> > > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > > >>       at
> > >
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > > .java:545)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > > anagedContextRunnable.java:49)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > > l
> > > (DefaultManagedContext.java:56)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > > ithContext(DefaultManagedContext.java:103)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > > thContext(DefaultManagedContext.java:53)
> > > >>       at
> > >
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > > agedContextRunnable.java:46)
> > > >>       at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > > 71
> > > )
> > > >>       at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .a
> > > c
> > > cess$301(ScheduledThreadPoolExecutor.java:178)
> > > >>       at
> > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > > .r
> > > u
> > > n(ScheduledThreadPoolExecutor.java:293)
> > > >>       at
> > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > > .j
> > > av
> > > a:1145)
> > > >>       at
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > ja
> > > va:615)
> > > >>       at java.lang.Thread.run(Thread.java:744)
> > > >> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> > > >> Invalid
> > > cluster session detected
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException:
> > > >> runid
> > > 1404924278923 is no longer valid
> > > >>       at
> > >
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > > ServerHostDaoImpl.java:147)
> > > >>       ... 26 more
> > > >>
> > > >>
> > > >>
> > > >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Likitha Shetty <li...@citrix.com>.
Are you guys are referring to the update method that is throwing the error, 'update(long id, long runid, Date lastUpdate) ?. Agreed that this is just a symptom.

But I was referring to the first update method in the file which is 'update(long id, long runid, String name, String version, String serviceIP, int servicePort, Date lastUpdate)'.
>From what I understand this is the method that updates the MS runId when it is started. And the problem is that the transaction commit made in this update is getting rolled back, which is why we have a wrong runId in the DB for the MS. Since the second update method is based on the right runId it throws an exception.
And I suspect the original transaction is getting rolled back because we are trying to close a transaction that is already closed.

Thanks,
Likitha

-----Original Message-----
From: Funs Kessen [mailto:FKessen@schubergphilis.com] 
Sent: Thursday, July 10, 2014 5:10 PM
To: dev@cloudstack.apache.org
Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Hi Santhosh,

You're right indeed, it's the update statement in the update itself that adds the runid in the where clause, it came in three years and three months ago by the looks of it.

Cheers,

Funs

> -----Original Message-----
> From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> Sent: Thursday, July 10, 2014 1:22 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> The mentioned issue by Funs was existing earlier as well but was 
> hidden by just logging it when update was failed, now with new change, 
> we resurfaced the problem by throwing it. Its not because of close 
> though. We can see to fix the original problem.
> 
> Santhosh
> ________________________________________
> From: Likitha Shetty [likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 7:18 AM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Santhosh, I pulled the latest code and unfortunately the issue still exists.
> 
> Looks like the problem is because we try to close the transaction 
> after it has been committed (which implicitly closes the transaction) 
> - /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> Impl.java
>             txn.commit();
>         } catch (RuntimeException e) {
>             txn.rollback();
>             s_logger.warn("Unexpected exception, ", e);
>             throw new RuntimeException(e.getMessage(), e);
>         }finally {
>             txn.close();
>         }
> 
> Thanks,
> Likitha
> 
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 2:38 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Yeah I see that Ipasted the wrong bit of code *doh*, it was update 
> that was the problem from the same file:
> 
> @@ -129,24 +133,29 @@
>      @DB
>      public void update(long id, long runid, Date lastUpdate) {
>          TransactionLegacy txn = TransactionLegacy.currentTxn();
> -        PreparedStatement pstmt = null;
>          try {
>              txn.start();
> +            try( PreparedStatement pstmt = 
> + txn.prepareStatement("update mshost
> set last_update=?, removed=null, alert_count=0 where id=? and 
> runid=?");) {
> +                pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), 
> lastUpdate));
> +                pstmt.setLong(2, id);
> +                pstmt.setLong(3, runid);
> 
> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> -            pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> -            pstmt.setLong(2, id);
> -            pstmt.setLong(3, runid);
> +                int count = pstmt.executeUpdate();
> +                txn.commit();
> 
> -            int count = pstmt.executeUpdate();
> -            txn.commit();
> -
> -            if (count < 1) {
> -                throw new CloudRuntimeException("Invalid cluster session
> detected", new ClusterInvalidSessionException("runid " + runid + " is 
> no longer valid"));
> +                if (count < 1) {
> +                    throw new CloudRuntimeException("Invalid cluster 
> + session
> detected", new ClusterInvalidSessionException("runid " + runid + " is 
> no longer valid"));
> +                }
> +            }catch (SQLException e) {
> +                throw new
> + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
>              }
> -        } catch (Exception e) {
> -            s_logger.warn("Unexpected exception, ", e);
> -            throw new RuntimeException(e.getMessage(), e);
> +        } catch (RuntimeException e) {
> +            txn.rollback();
> +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> +            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
> +        }
> +        finally {
> +            txn.close();
>          }
>      }
> 
> 
> > -----Original Message-----
> > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> > Sent: Thursday, July 10, 2014 11:00 AM
> > To: dev@cloudstack.apache.org
> > Subject: Re: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Fixed already, good job Santosh
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> >
> > > Doesn't seem related to that particular commit, the function
> > invalidateRunSession appears to be dead code.
> > >
> > > I'll do some digging.
> > >
> > >
> > > Cheers,
> > >
> > > Hugo
> > >
> > > On 10 jul. 2014, at 10:40, Funs Kessen 
> > > <FK...@schubergphilis.com>
> > wrote:
> > >
> > >> Hi Devs,
> > >>
> > >> I recently noticed on master that after starting, stopping and 
> > >> starting the
> > management server again I get stacktraces about every second.
> > >> After some digging with Daan we found that part of the
> > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> The
> > problem is however deeper rooted, as the code is supposed to update 
> > the mshost table, but because another Session ID (actually _runId 
> > from ClusterManagerImpl.java from the previous run)  is already in 
> > there the update fails. The Session ID is based on time in 
> > milliseconds, and offcourse changes when you stop and start the management server again.
> > >> Prior to the commit it failed silently, and has done so since the 
> > >> initial
> > checkin it seems. The real question is what the original idea is 
> > behind the _runId, and is it something that requires fixing ?
> > >>
> > >> I've pasted part of the commit, sql and stacktrace below.
> > >>
> > >> Cheers,
> > >>
> > >> Funs
> > >>
> > >> ===
> > >> mysql> select * from mshost;
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> | id | msid           | runid         | name    | state | version        | service_ip |
> > service_port | last_update         | removed | alert_count |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0
> |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> 1 row in set (0.00 sec)
> > >>
> > >> ===
> > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> 02
> > >> 10:38:16
> > >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-
> 07-
> > 04
> > >> 12:47:58
> > >>
> > >> Fixed Resource Leaks, null dereferences, few other issues 
> > >> reported by coverity
> > >>
> > >> -
> > >>
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > m
> > >> pl.java diff --git
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >>
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> index 3d0c3f5..89d7d27 100644
> > >> ---
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> +++
> > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > >> +++ tDaoImpl.java
> > >> @@ -53,15 +53,14 @@
> > >>    @Override
> > >>    public void invalidateRunSession(long id, long runid) {
> > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > >> -        PreparedStatement pstmt = null;
> > >> -        try {
> > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > runid=0, state='Down' where id=? and runid=?");
> > >> -            pstmt.setLong(1, id);
> > >> -            pstmt.setLong(2, runid);
> > >> -
> > >> -            pstmt.executeUpdate();
> > >> +        try (PreparedStatement pstmt = 
> > >> + txn.prepareStatement("update
> > mshost set runid=0, state='Down' where id=? and runid=?");){
> > >> +            if(pstmt != null) {
> > >> +                pstmt.setLong(1, id);
> > >> +                pstmt.setLong(2, runid);
> > >> +                pstmt.executeUpdate();
> > >> +            }
> > >>        } catch (SQLException e) {
> > >> -            throw new CloudRuntimeException("DB exception on " +
> > pstmt.toString(), e);
> > >> +            throw new
> > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > >> + e.getMessage(), e);
> > >>        }
> > >>    }
> > >>
> > >> ===
> > >> 2014-07-09 18:45:06,541 WARN
> [c.c.c.d.ManagementServerHostDaoImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid 
> > >> cluster session detected
> > >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster 
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > io
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > oi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > on
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > ke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > 71
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > .a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > .r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > .j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: 
> > >> runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in 
> > >> cluster heartbeat
> > >> java.lang.RuntimeException: update:Exception:Invalid cluster 
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:155)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflect
> > io
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJ
> > oi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionC
> > on
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invo
> > ke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:4
> > 71
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > .a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
> > .r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
> > .j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.utils.exception.CloudRuntimeException: 
> > >> Invalid
> > cluster session detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: 
> > >> runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >>
> > >>
> > >>
> > >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Funs Kessen <FK...@schubergphilis.com>.
Hi Santhosh,

You're right indeed, it's the update statement in the update itself that adds the runid in the where clause, it came in three years and three months ago by the looks of it.

Cheers,

Funs

> -----Original Message-----
> From: Santhosh Edukulla [mailto:santhosh.edukulla@citrix.com]
> Sent: Thursday, July 10, 2014 1:22 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> The mentioned issue by Funs was existing earlier as well but was hidden by
> just logging it when update was failed, now with new change, we resurfaced
> the problem by throwing it. Its not because of close though. We can see to fix
> the original problem.
> 
> Santhosh
> ________________________________________
> From: Likitha Shetty [likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 7:18 AM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Santhosh, I pulled the latest code and unfortunately the issue still exists.
> 
> Looks like the problem is because we try to close the transaction after it has
> been committed (which implicitly closes the transaction) -
> /framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDao
> Impl.java
>             txn.commit();
>         } catch (RuntimeException e) {
>             txn.rollback();
>             s_logger.warn("Unexpected exception, ", e);
>             throw new RuntimeException(e.getMessage(), e);
>         }finally {
>             txn.close();
>         }
> 
> Thanks,
> Likitha
> 
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 2:38 PM
> To: dev@cloudstack.apache.org
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Yeah I see that Ipasted the wrong bit of code *doh*, it was update that was
> the problem from the same file:
> 
> @@ -129,24 +133,29 @@
>      @DB
>      public void update(long id, long runid, Date lastUpdate) {
>          TransactionLegacy txn = TransactionLegacy.currentTxn();
> -        PreparedStatement pstmt = null;
>          try {
>              txn.start();
> +            try( PreparedStatement pstmt = txn.prepareStatement("update mshost
> set last_update=?, removed=null, alert_count=0 where id=? and runid=?");) {
> +                pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> +                pstmt.setLong(2, id);
> +                pstmt.setLong(3, runid);
> 
> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> last_update=?, removed=null, alert_count=0 where id=? and runid=?");
> -            pstmt.setString(1,
> DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
> -            pstmt.setLong(2, id);
> -            pstmt.setLong(3, runid);
> +                int count = pstmt.executeUpdate();
> +                txn.commit();
> 
> -            int count = pstmt.executeUpdate();
> -            txn.commit();
> -
> -            if (count < 1) {
> -                throw new CloudRuntimeException("Invalid cluster session
> detected", new ClusterInvalidSessionException("runid " + runid + " is no
> longer valid"));
> +                if (count < 1) {
> +                    throw new CloudRuntimeException("Invalid cluster session
> detected", new ClusterInvalidSessionException("runid " + runid + " is no
> longer valid"));
> +                }
> +            }catch (SQLException e) {
> +                throw new
> + CloudRuntimeException("update:Exception:"+e.getMessage(), e);
>              }
> -        } catch (Exception e) {
> -            s_logger.warn("Unexpected exception, ", e);
> -            throw new RuntimeException(e.getMessage(), e);
> +        } catch (RuntimeException e) {
> +            txn.rollback();
> +            s_logger.warn("update:Exception:"+e.getMessage(), e);
> +            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
> +        }
> +        finally {
> +            txn.close();
>          }
>      }
> 
> 
> > -----Original Message-----
> > From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> > Sent: Thursday, July 10, 2014 11:00 AM
> > To: dev@cloudstack.apache.org
> > Subject: Re: "Bug" introduced by part of commit
> > a600d8408ea86782318139c17cf346c8497943d0
> >
> > Fixed already, good job Santosh
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> >
> > > Doesn't seem related to that particular commit, the function
> > invalidateRunSession appears to be dead code.
> > >
> > > I'll do some digging.
> > >
> > >
> > > Cheers,
> > >
> > > Hugo
> > >
> > > On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com>
> > wrote:
> > >
> > >> Hi Devs,
> > >>
> > >> I recently noticed on master that after starting, stopping and
> > >> starting the
> > management server again I get stacktraces about every second.
> > >> After some digging with Daan we found that part of the
> > a600d8408ea86782318139c17cf346c84979943d0 commit causes this.
> The
> > problem is however deeper rooted, as the code is supposed to update
> > the mshost table, but because another Session ID (actually _runId from
> > ClusterManagerImpl.java from the previous run)  is already in there
> > the update fails. The Session ID is based on time in milliseconds, and
> > offcourse changes when you stop and start the management server again.
> > >> Prior to the commit it failed silently, and has done so since the
> > >> initial
> > checkin it seems. The real question is what the original idea is
> > behind the _runId, and is it something that requires fixing ?
> > >>
> > >> I've pasted part of the commit, sql and stacktrace below.
> > >>
> > >> Cheers,
> > >>
> > >> Funs
> > >>
> > >> ===
> > >> mysql> select * from mshost;
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> | id | msid           | runid         | name    | state | version        | service_ip |
> > service_port | last_update         | removed | alert_count |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> > SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0
> |
> > >> +----+----------------+---------------+---------+-------+----------------+------------+--------
> --
> > ----+---------------------+---------+-------------+
> > >> 1 row in set (0.00 sec)
> > >>
> > >> ===
> > >> commit a600d8408ea86782318139c17cf346c8497943d0
> > >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> 02
> > >> 10:38:16
> > >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-
> 07-
> > 04
> > >> 12:47:58
> > >>
> > >> Fixed Resource Leaks, null dereferences, few other issues reported
> > >> by coverity
> > >>
> > >> -
> > >>
> >
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> > m
> > >> pl.java diff --git
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >>
> >
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> index 3d0c3f5..89d7d27 100644
> > >> ---
> > >>
> >
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> > o
> > >> Impl.java
> > >> +++
> > b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> > >> +++ tDaoImpl.java
> > >> @@ -53,15 +53,14 @@
> > >>    @Override
> > >>    public void invalidateRunSession(long id, long runid) {
> > >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> > >> -        PreparedStatement pstmt = null;
> > >> -        try {
> > >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> > runid=0, state='Down' where id=? and runid=?");
> > >> -            pstmt.setLong(1, id);
> > >> -            pstmt.setLong(2, runid);
> > >> -
> > >> -            pstmt.executeUpdate();
> > >> +        try (PreparedStatement pstmt =
> > >> + txn.prepareStatement("update
> > mshost set runid=0, state='Down' where id=? and runid=?");){
> > >> +            if(pstmt != null) {
> > >> +                pstmt.setLong(1, id);
> > >> +                pstmt.setLong(2, runid);
> > >> +                pstmt.executeUpdate();
> > >> +            }
> > >>        } catch (SQLException e) {
> > >> -            throw new CloudRuntimeException("DB exception on " +
> > pstmt.toString(), e);
> > >> +            throw new
> > >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> > >> + e.getMessage(), e);
> > >>        }
> > >>    }
> > >>
> > >> ===
> > >> 2014-07-09 18:45:06,541 WARN
> [c.c.c.d.ManagementServerHostDaoImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster
> > >> session detected
> > >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> > >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster
> > >> heartbeat
> > >> java.lang.RuntimeException: update:Exception:Invalid cluster
> > >> session
> > detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:155)
> > >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>       at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > a
> > va:57)
> > >>       at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > orImpl.java:43)
> > >>       at java.lang.reflect.Method.invoke(Method.java:606)
> > >>       at
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> > n(
> > AopUtils.java:317)
> > >>       at
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> > n
> > point(ReflectiveMethodInvocation.java:183)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:150)
> > >>       at
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> > tex
> > tInterceptor.java:34)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:161)
> > >>       at
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> > (Ex
> > poseInvocationInterceptor.java:91)
> > >>       at
> >
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> > eflectiveMethodInvocation.java:172)
> > >>       at
> >
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> > icAopProxy.java:204)
> > >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> > >>       at
> >
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> > .java:545)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> > anagedContextRunnable.java:49)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> > l
> > (DefaultManagedContext.java:56)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> > ithContext(DefaultManagedContext.java:103)
> > >>       at
> >
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> > thContext(DefaultManagedContext.java:53)
> > >>       at
> >
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> > agedContextRunnable.java:46)
> > >>       at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> > )
> > >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> > c
> > cess$301(ScheduledThreadPoolExecutor.java:178)
> > >>       at
> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> > u
> > n(ScheduledThreadPoolExecutor.java:293)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > av
> > a:1145)
> > >>       at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > ja
> > va:615)
> > >>       at java.lang.Thread.run(Thread.java:744)
> > >> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> > cluster session detected
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> > 1404924278923 is no longer valid
> > >>       at
> >
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> > ServerHostDaoImpl.java:147)
> > >>       ... 26 more
> > >>
> > >>
> > >>
> > >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Santhosh Edukulla <sa...@citrix.com>.
The mentioned issue by Funs was existing earlier as well but was hidden by just logging it when update was failed, now with new change, we resurfaced the problem by throwing it. Its not because of close though. We can see to fix the original problem. 

Santhosh
________________________________________
From: Likitha Shetty [likitha.shetty@citrix.com]
Sent: Thursday, July 10, 2014 7:18 AM
To: dev@cloudstack.apache.org
Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Santhosh, I pulled the latest code and unfortunately the issue still exists.

Looks like the problem is because we try to close the transaction after it has been committed (which implicitly closes the transaction) -
/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
            txn.commit();
        } catch (RuntimeException e) {
            txn.rollback();
            s_logger.warn("Unexpected exception, ", e);
            throw new RuntimeException(e.getMessage(), e);
        }finally {
            txn.close();
        }

Thanks,
Likitha

-----Original Message-----
From: Funs Kessen [mailto:FKessen@schubergphilis.com]
Sent: Thursday, July 10, 2014 2:38 PM
To: dev@cloudstack.apache.org
Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Yeah I see that Ipasted the wrong bit of code *doh*, it was update that was the problem from the same file:

@@ -129,24 +133,29 @@
     @DB
     public void update(long id, long runid, Date lastUpdate) {
         TransactionLegacy txn = TransactionLegacy.currentTxn();
-        PreparedStatement pstmt = null;
         try {
             txn.start();
+            try( PreparedStatement pstmt = txn.prepareStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");) {
+                pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
+                pstmt.setLong(2, id);
+                pstmt.setLong(3, runid);

-            pstmt = txn.prepareAutoCloseStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");
-            pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
-            pstmt.setLong(2, id);
-            pstmt.setLong(3, runid);
+                int count = pstmt.executeUpdate();
+                txn.commit();

-            int count = pstmt.executeUpdate();
-            txn.commit();
-
-            if (count < 1) {
-                throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                if (count < 1) {
+                    throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                }
+            }catch (SQLException e) {
+                throw new
+ CloudRuntimeException("update:Exception:"+e.getMessage(), e);
             }
-        } catch (Exception e) {
-            s_logger.warn("Unexpected exception, ", e);
-            throw new RuntimeException(e.getMessage(), e);
+        } catch (RuntimeException e) {
+            txn.rollback();
+            s_logger.warn("update:Exception:"+e.getMessage(), e);
+            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
+        }
+        finally {
+            txn.close();
         }
     }


> -----Original Message-----
> From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> Sent: Thursday, July 10, 2014 11:00 AM
> To: dev@cloudstack.apache.org
> Subject: Re: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
>
> Fixed already, good job Santosh
>
> Cheers,
>
> Hugo
>
> On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
>
> > Doesn't seem related to that particular commit, the function
> invalidateRunSession appears to be dead code.
> >
> > I'll do some digging.
> >
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com>
> wrote:
> >
> >> Hi Devs,
> >>
> >> I recently noticed on master that after starting, stopping and
> >> starting the
> management server again I get stacktraces about every second.
> >> After some digging with Daan we found that part of the
> a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The
> problem is however deeper rooted, as the code is supposed to update
> the mshost table, but because another Session ID (actually _runId from
> ClusterManagerImpl.java from the previous run)  is already in there
> the update fails. The Session ID is based on time in milliseconds, and
> offcourse changes when you stop and start the management server again.
> >> Prior to the commit it failed silently, and has done so since the
> >> initial
> checkin it seems. The real question is what the original idea is
> behind the _runId, and is it something that requires fixing ?
> >>
> >> I've pasted part of the commit, sql and stacktrace below.
> >>
> >> Cheers,
> >>
> >> Funs
> >>
> >> ===
> >> mysql> select * from mshost;
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> | id | msid           | runid         | name    | state | version        | service_ip |
> service_port | last_update         | removed | alert_count |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> 1 row in set (0.00 sec)
> >>
> >> ===
> >> commit a600d8408ea86782318139c17cf346c8497943d0
> >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02
> >> 10:38:16
> >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> 04
> >> 12:47:58
> >>
> >> Fixed Resource Leaks, null dereferences, few other issues reported
> >> by coverity
> >>
> >> -
> >>
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> m
> >> pl.java diff --git
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >>
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> index 3d0c3f5..89d7d27 100644
> >> ---
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> +++
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> >> +++ tDaoImpl.java
> >> @@ -53,15 +53,14 @@
> >>    @Override
> >>    public void invalidateRunSession(long id, long runid) {
> >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> >> -        PreparedStatement pstmt = null;
> >> -        try {
> >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> runid=0, state='Down' where id=? and runid=?");
> >> -            pstmt.setLong(1, id);
> >> -            pstmt.setLong(2, runid);
> >> -
> >> -            pstmt.executeUpdate();
> >> +        try (PreparedStatement pstmt =
> >> + txn.prepareStatement("update
> mshost set runid=0, state='Down' where id=? and runid=?");){
> >> +            if(pstmt != null) {
> >> +                pstmt.setLong(1, id);
> >> +                pstmt.setLong(2, runid);
> >> +                pstmt.executeUpdate();
> >> +            }
> >>        } catch (SQLException e) {
> >> -            throw new CloudRuntimeException("DB exception on " +
> pstmt.toString(), e);
> >> +            throw new
> >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> >> + e.getMessage(), e);
> >>        }
> >>    }
> >>
> >> ===
> >> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster
> >> session detected
> >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster
> >> session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> a
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> n(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> n
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> tex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> (Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> l
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> )
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> c
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> u
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> av
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster
> >> heartbeat
> >> java.lang.RuntimeException: update:Exception:Invalid cluster
> >> session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:155)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> a
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> n(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> n
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> tex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> (Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> l
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> )
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> c
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> u
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> av
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> cluster session detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >>
> >>
> >>
> >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Likitha Shetty <li...@citrix.com>.
Santhosh, I pulled the latest code and unfortunately the issue still exists. 

Looks like the problem is because we try to close the transaction after it has been committed (which implicitly closes the transaction) -
/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
            txn.commit();
        } catch (RuntimeException e) {
            txn.rollback();
            s_logger.warn("Unexpected exception, ", e);
            throw new RuntimeException(e.getMessage(), e);
        }finally {
            txn.close();
        }

Thanks,
Likitha

-----Original Message-----
From: Funs Kessen [mailto:FKessen@schubergphilis.com] 
Sent: Thursday, July 10, 2014 2:38 PM
To: dev@cloudstack.apache.org
Subject: RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Yeah I see that Ipasted the wrong bit of code *doh*, it was update that was the problem from the same file:

@@ -129,24 +133,29 @@
     @DB
     public void update(long id, long runid, Date lastUpdate) {
         TransactionLegacy txn = TransactionLegacy.currentTxn();
-        PreparedStatement pstmt = null;
         try {
             txn.start();
+            try( PreparedStatement pstmt = txn.prepareStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");) {
+                pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
+                pstmt.setLong(2, id);
+                pstmt.setLong(3, runid);
 
-            pstmt = txn.prepareAutoCloseStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");
-            pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
-            pstmt.setLong(2, id);
-            pstmt.setLong(3, runid);
+                int count = pstmt.executeUpdate();
+                txn.commit();
 
-            int count = pstmt.executeUpdate();
-            txn.commit();
-
-            if (count < 1) {
-                throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                if (count < 1) {
+                    throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                }
+            }catch (SQLException e) {
+                throw new 
+ CloudRuntimeException("update:Exception:"+e.getMessage(), e);
             }
-        } catch (Exception e) {
-            s_logger.warn("Unexpected exception, ", e);
-            throw new RuntimeException(e.getMessage(), e);
+        } catch (RuntimeException e) {
+            txn.rollback();
+            s_logger.warn("update:Exception:"+e.getMessage(), e);
+            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
+        }
+        finally {
+            txn.close();
         }
     }


> -----Original Message-----
> From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> Sent: Thursday, July 10, 2014 11:00 AM
> To: dev@cloudstack.apache.org
> Subject: Re: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Fixed already, good job Santosh
> 
> Cheers,
> 
> Hugo
> 
> On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> 
> > Doesn't seem related to that particular commit, the function
> invalidateRunSession appears to be dead code.
> >
> > I'll do some digging.
> >
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com>
> wrote:
> >
> >> Hi Devs,
> >>
> >> I recently noticed on master that after starting, stopping and 
> >> starting the
> management server again I get stacktraces about every second.
> >> After some digging with Daan we found that part of the
> a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The 
> problem is however deeper rooted, as the code is supposed to update 
> the mshost table, but because another Session ID (actually _runId from 
> ClusterManagerImpl.java from the previous run)  is already in there 
> the update fails. The Session ID is based on time in milliseconds, and 
> offcourse changes when you stop and start the management server again.
> >> Prior to the commit it failed silently, and has done so since the 
> >> initial
> checkin it seems. The real question is what the original idea is 
> behind the _runId, and is it something that requires fixing ?
> >>
> >> I've pasted part of the commit, sql and stacktrace below.
> >>
> >> Cheers,
> >>
> >> Funs
> >>
> >> ===
> >> mysql> select * from mshost;
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> | id | msid           | runid         | name    | state | version        | service_ip |
> service_port | last_update         | removed | alert_count |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> 1 row in set (0.00 sec)
> >>
> >> ===
> >> commit a600d8408ea86782318139c17cf346c8497943d0
> >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02
> >> 10:38:16
> >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> 04
> >> 12:47:58
> >>
> >> Fixed Resource Leaks, null dereferences, few other issues reported 
> >> by coverity
> >>
> >> -
> >>
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> m
> >> pl.java diff --git
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >>
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> index 3d0c3f5..89d7d27 100644
> >> ---
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> +++
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> >> +++ tDaoImpl.java
> >> @@ -53,15 +53,14 @@
> >>    @Override
> >>    public void invalidateRunSession(long id, long runid) {
> >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> >> -        PreparedStatement pstmt = null;
> >> -        try {
> >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> runid=0, state='Down' where id=? and runid=?");
> >> -            pstmt.setLong(1, id);
> >> -            pstmt.setLong(2, runid);
> >> -
> >> -            pstmt.executeUpdate();
> >> +        try (PreparedStatement pstmt = 
> >> + txn.prepareStatement("update
> mshost set runid=0, state='Down' where id=? and runid=?");){
> >> +            if(pstmt != null) {
> >> +                pstmt.setLong(1, id);
> >> +                pstmt.setLong(2, runid);
> >> +                pstmt.executeUpdate();
> >> +            }
> >>        } catch (SQLException e) {
> >> -            throw new CloudRuntimeException("DB exception on " +
> pstmt.toString(), e);
> >> +            throw new
> >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> >> + e.getMessage(), e);
> >>        }
> >>    }
> >>
> >> ===
> >> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster 
> >> session detected
> >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster 
> >> session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> a
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> n(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> n
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> tex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> (Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> l
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> )
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> c
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> u
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> av
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster 
> >> heartbeat
> >> java.lang.RuntimeException: update:Exception:Invalid cluster 
> >> session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:155)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> a
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflectio
> n(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoi
> n
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionCon
> tex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke
> (Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.cal
> l
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471
> )
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.a
> c
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.r
> u
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> av
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> cluster session detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >>
> >>
> >>
> >


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Funs Kessen <FK...@schubergphilis.com>.
Yeah I see that Ipasted the wrong bit of code *doh*, it was update that was the problem from the same file:

@@ -129,24 +133,29 @@
     @DB
     public void update(long id, long runid, Date lastUpdate) {
         TransactionLegacy txn = TransactionLegacy.currentTxn();
-        PreparedStatement pstmt = null;
         try {
             txn.start();
+            try( PreparedStatement pstmt = txn.prepareStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");) {
+                pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
+                pstmt.setLong(2, id);
+                pstmt.setLong(3, runid);
 
-            pstmt = txn.prepareAutoCloseStatement("update mshost set last_update=?, removed=null, alert_count=0 where id=? and runid=?");
-            pstmt.setString(1, DateUtil.getDateDisplayString(TimeZone.getTimeZone("GMT"), lastUpdate));
-            pstmt.setLong(2, id);
-            pstmt.setLong(3, runid);
+                int count = pstmt.executeUpdate();
+                txn.commit();
 
-            int count = pstmt.executeUpdate();
-            txn.commit();
-
-            if (count < 1) {
-                throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                if (count < 1) {
+                    throw new CloudRuntimeException("Invalid cluster session detected", new ClusterInvalidSessionException("runid " + runid + " is no longer valid"));
+                }
+            }catch (SQLException e) {
+                throw new CloudRuntimeException("update:Exception:"+e.getMessage(), e);
             }
-        } catch (Exception e) {
-            s_logger.warn("Unexpected exception, ", e);
-            throw new RuntimeException(e.getMessage(), e);
+        } catch (RuntimeException e) {
+            txn.rollback();
+            s_logger.warn("update:Exception:"+e.getMessage(), e);
+            throw new RuntimeException("update:Exception:"+e.getMessage(), e);
+        }
+        finally {
+            txn.close();
         }
     }


> -----Original Message-----
> From: Trippie [mailto:trippie@gmail.com] On Behalf Of Hugo Trippaers
> Sent: Thursday, July 10, 2014 11:00 AM
> To: dev@cloudstack.apache.org
> Subject: Re: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Fixed already, good job Santosh
> 
> Cheers,
> 
> Hugo
> 
> On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:
> 
> > Doesn't seem related to that particular commit, the function
> invalidateRunSession appears to be dead code.
> >
> > I'll do some digging.
> >
> >
> > Cheers,
> >
> > Hugo
> >
> > On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com>
> wrote:
> >
> >> Hi Devs,
> >>
> >> I recently noticed on master that after starting, stopping and starting the
> management server again I get stacktraces about every second.
> >> After some digging with Daan we found that part of the
> a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The
> problem is however deeper rooted, as the code is supposed to update the
> mshost table, but because another Session ID (actually _runId from
> ClusterManagerImpl.java from the previous run)  is already in there the
> update fails. The Session ID is based on time in milliseconds, and offcourse
> changes when you stop and start the management server again.
> >> Prior to the commit it failed silently, and has done so since the initial
> checkin it seems. The real question is what the original idea is behind the
> _runId, and is it something that requires fixing ?
> >>
> >> I've pasted part of the commit, sql and stacktrace below.
> >>
> >> Cheers,
> >>
> >> Funs
> >>
> >> ===
> >> mysql> select * from mshost;
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> | id | msid           | runid         | name    | state | version        | service_ip |
> service_port | last_update         | removed | alert_count |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
> >> +----+----------------+---------------+---------+-------+----------------+------------+----------
> ----+---------------------+---------+-------------+
> >> 1 row in set (0.00 sec)
> >>
> >> ===
> >> commit a600d8408ea86782318139c17cf346c8497943d0
> >> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02
> >> 10:38:16
> >> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-
> 04
> >> 12:47:58
> >>
> >> Fixed Resource Leaks, null dereferences, few other issues reported by
> >> coverity
> >>
> >> -
> >>
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> m
> >> pl.java diff --git
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >>
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> index 3d0c3f5..89d7d27 100644
> >> ---
> >>
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> o
> >> Impl.java
> >> +++
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHos
> >> +++ tDaoImpl.java
> >> @@ -53,15 +53,14 @@
> >>    @Override
> >>    public void invalidateRunSession(long id, long runid) {
> >>        TransactionLegacy txn = TransactionLegacy.currentTxn();
> >> -        PreparedStatement pstmt = null;
> >> -        try {
> >> -            pstmt = txn.prepareAutoCloseStatement("update mshost set
> runid=0, state='Down' where id=? and runid=?");
> >> -            pstmt.setLong(1, id);
> >> -            pstmt.setLong(2, runid);
> >> -
> >> -            pstmt.executeUpdate();
> >> +        try (PreparedStatement pstmt = txn.prepareStatement("update
> mshost set runid=0, state='Down' where id=? and runid=?");){
> >> +            if(pstmt != null) {
> >> +                pstmt.setLong(1, id);
> >> +                pstmt.setLong(2, runid);
> >> +                pstmt.executeUpdate();
> >> +            }
> >>        } catch (SQLException e) {
> >> -            throw new CloudRuntimeException("DB exception on " +
> pstmt.toString(), e);
> >> +            throw new
> >> + CloudRuntimeException("invalidateRunSession:Exception:"+
> >> + e.getMessage(), e);
> >>        }
> >>    }
> >>
> >> ===
> >> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster
> >> session detected
> >> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster
> >> heartbeat
> >> java.lang.RuntimeException: update:Exception:Invalid cluster session
> detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:155)
> >>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja
> va:57)
> >>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
> >>       at java.lang.reflect.Method.invoke(Method.java:606)
> >>       at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(
> AopUtils.java:317)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin
> point(ReflectiveMethodInvocation.java:183)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
> >>       at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex
> tInterceptor.java:34)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
> >>       at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex
> poseInvocationInterceptor.java:91)
> >>       at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
> >>       at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
> >>       at com.sun.proxy.$Proxy158.update(Unknown Source)
> >>       at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call
> (DefaultManagedContext.java:56)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
> >>       at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
> >>       at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
> >>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac
> cess$301(ScheduledThreadPoolExecutor.java:178)
> >>       at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru
> n(ScheduledThreadPoolExecutor.java:293)
> >>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
> a:1145)
> >>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:615)
> >>       at java.lang.Thread.run(Thread.java:744)
> >> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> cluster session detected
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
> >>       at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
> >>       ... 26 more
> >>
> >>
> >>
> >


Re: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Hugo Trippaers <hu...@trippaers.nl>.
Fixed already, good job Santosh

Cheers,

Hugo

On 10 jul. 2014, at 10:58, Hugo Trippaers <tr...@gmail.com> wrote:

> Doesn’t seem related to that particular commit, the function invalidateRunSession appears to be dead code.
> 
> I’ll do some digging.
> 
> 
> Cheers,
> 
> Hugo
> 
> On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com> wrote:
> 
>> Hi Devs,
>> 
>> I recently noticed on master that after starting, stopping and starting the management server again I get stacktraces about every second.
>> After some digging with Daan we found that part of the a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The problem is however deeper rooted, as the code is supposed to update the mshost table, but because another Session ID (actually _runId from ClusterManagerImpl.java from the previous run)  is already in there the update fails. The Session ID is based on time in milliseconds, and offcourse changes when you stop and start the management server again.
>> Prior to the commit it failed silently, and has done so since the initial checkin it seems. The real question is what the original idea is behind the _runId, and is it something that requires fixing ?
>> 
>> I've pasted part of the commit, sql and stacktrace below.
>> 
>> Cheers,
>> 
>> Funs
>> 
>> ===
>> mysql> select * from mshost;
>> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
>> | id | msid           | runid         | name    | state | version        | service_ip | service_port | last_update         | removed | alert_count |
>> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
>> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
>> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
>> 1 row in set (0.00 sec)
>> 
>> ===
>> commit a600d8408ea86782318139c17cf346c8497943d0
>> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02 10:38:16
>> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-04 12:47:58
>> 
>> Fixed Resource Leaks, null dereferences, few other issues reported by coverity
>> 
>> - framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java 
>> diff --git a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
>> index 3d0c3f5..89d7d27 100644
>> --- a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
>> +++ b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
>> @@ -53,15 +53,14 @@
>>    @Override
>>    public void invalidateRunSession(long id, long runid) {
>>        TransactionLegacy txn = TransactionLegacy.currentTxn();
>> -        PreparedStatement pstmt = null;
>> -        try {
>> -            pstmt = txn.prepareAutoCloseStatement("update mshost set runid=0, state='Down' where id=? and runid=?");
>> -            pstmt.setLong(1, id);
>> -            pstmt.setLong(2, runid);
>> -
>> -            pstmt.executeUpdate();
>> +        try (PreparedStatement pstmt = txn.prepareStatement("update mshost set runid=0, state='Down' where id=? and runid=?");){
>> +            if(pstmt != null) {
>> +                pstmt.setLong(1, id);
>> +                pstmt.setLong(2, runid);
>> +                pstmt.executeUpdate();
>> +            }
>>        } catch (SQLException e) {
>> -            throw new CloudRuntimeException("DB exception on " + pstmt.toString(), e);
>> +            throw new CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(), e);
>>        }
>>    }
>> 
>> ===
>> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster session detected
>> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
>>       at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>       at java.lang.reflect.Method.invoke(Method.java:606)
>>       at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>>       at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>>       at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>       at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>>       at com.sun.proxy.$Proxy158.update(Unknown Source)
>>       at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>       at java.lang.Thread.run(Thread.java:744)
>> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
>>       at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>>       ... 26 more
>> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat
>> java.lang.RuntimeException: update:Exception:Invalid cluster session detected
>>       at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:155)
>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>       at java.lang.reflect.Method.invoke(Method.java:606)
>>       at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>>       at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>>       at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>       at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>>       at com.sun.proxy.$Proxy158.update(Unknown Source)
>>       at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>       at java.lang.Thread.run(Thread.java:744)
>> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
>>       at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>>       ... 26 more
>> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
>>       at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>>       ... 26 more
>> 
>> 
>> 
> 


Re: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Hugo Trippaers <hu...@trippaers.nl>.
Doesn’t seem related to that particular commit, the function invalidateRunSession appears to be dead code.

I’ll do some digging.


Cheers,

Hugo

On 10 jul. 2014, at 10:40, Funs Kessen <FK...@schubergphilis.com> wrote:

> Hi Devs,
> 
> I recently noticed on master that after starting, stopping and starting the management server again I get stacktraces about every second.
> After some digging with Daan we found that part of the a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The problem is however deeper rooted, as the code is supposed to update the mshost table, but because another Session ID (actually _runId from ClusterManagerImpl.java from the previous run)  is already in there the update fails. The Session ID is based on time in milliseconds, and offcourse changes when you stop and start the management server again.
> Prior to the commit it failed silently, and has done so since the initial checkin it seems. The real question is what the original idea is behind the _runId, and is it something that requires fixing ?
> 
> I've pasted part of the commit, sql and stacktrace below.
> 
> Cheers,
> 
> Funs
> 
> ===
> mysql> select * from mshost;
> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
> | id | msid           | runid         | name    | state | version        | service_ip | service_port | last_update         | removed | alert_count |
> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
> +----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
> 1 row in set (0.00 sec)
> 
> ===
> commit a600d8408ea86782318139c17cf346c8497943d0
> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02 10:38:16
> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-04 12:47:58
> 
> Fixed Resource Leaks, null dereferences, few other issues reported by coverity
> 
> - framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java 
> diff --git a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
> index 3d0c3f5..89d7d27 100644
> --- a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
> +++ b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
> @@ -53,15 +53,14 @@
>     @Override
>     public void invalidateRunSession(long id, long runid) {
>         TransactionLegacy txn = TransactionLegacy.currentTxn();
> -        PreparedStatement pstmt = null;
> -        try {
> -            pstmt = txn.prepareAutoCloseStatement("update mshost set runid=0, state='Down' where id=? and runid=?");
> -            pstmt.setLong(1, id);
> -            pstmt.setLong(2, runid);
> -
> -            pstmt.executeUpdate();
> +        try (PreparedStatement pstmt = txn.prepareStatement("update mshost set runid=0, state='Down' where id=? and runid=?");){
> +            if(pstmt != null) {
> +                pstmt.setLong(1, id);
> +                pstmt.setLong(2, runid);
> +                pstmt.executeUpdate();
> +            }
>         } catch (SQLException e) {
> -            throw new CloudRuntimeException("DB exception on " + pstmt.toString(), e);
> +            throw new CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(), e);
>         }
>     }
> 
> ===
> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster session detected
> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
>        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:606)
>        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>        at com.sun.proxy.$Proxy158.update(Unknown Source)
>        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
>        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>        at java.lang.Thread.run(Thread.java:744)
> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
>        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>        ... 26 more
> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat
> java.lang.RuntimeException: update:Exception:Invalid cluster session detected
>        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:155)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:606)
>        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>        at com.sun.proxy.$Proxy158.update(Unknown Source)
>        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
>        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>        at java.lang.Thread.run(Thread.java:744)
> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
>        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>        ... 26 more
> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
>        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
>        ... 26 more
> 
> 
> 


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Funs Kessen <FK...@schubergphilis.com>.
Hi Likitha,

Ah cool thanks!

Cheers,

Funs

> -----Original Message-----
> From: Likitha Shetty [mailto:likitha.shetty@citrix.com]
> Sent: Thursday, July 10, 2014 10:50 AM
> To: dev@cloudstack.apache.org
> Cc: Daan Hoogland
> Subject: RE: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Funs, Santhosh fixed this problem yesterday.
> https://issues.apache.org/jira/browse/CLOUDSTACK-7079 has more details.
> 
> Thanks,
> Likitha
> 
> -----Original Message-----
> From: Funs Kessen [mailto:FKessen@schubergphilis.com]
> Sent: Thursday, July 10, 2014 2:10 PM
> To: dev@cloudstack.apache.org
> Cc: Daan Hoogland
> Subject: "Bug" introduced by part of commit
> a600d8408ea86782318139c17cf346c8497943d0
> 
> Hi Devs,
> 
> I recently noticed on master that after starting, stopping and starting the
> management server again I get stacktraces about every second.
> After some digging with Daan we found that part of the
> a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The
> problem is however deeper rooted, as the code is supposed to update the
> mshost table, but because another Session ID (actually _runId from
> ClusterManagerImpl.java from the previous run)  is already in there the
> update fails. The Session ID is based on time in milliseconds, and offcourse
> changes when you stop and start the management server again.
> Prior to the commit it failed silently, and has done so since the initial
> checkin it seems. The real question is what the original idea is behind the
> _runId, and is it something that requires fixing ?
> 
> I've pasted part of the commit, sql and stacktrace below.
> 
> Cheers,
> 
> Funs
> 
> ===
> mysql> select * from mshost;
> +----+----------------+---------------+---------+-------+----------------+------------+--------------
> +---------------------+---------+-------------+
> | id | msid           | runid         | name    | state | version        | service_ip |
> service_port | last_update         | removed | alert_count |
> +----+----------------+---------------+---------+-------+----------------+------------+--------------
> +---------------------+---------+-------------+
> |  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-
> SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
> +----+----------------+---------------+---------+-------+----------------+------------+--------------
> +---------------------+---------+-------------+
> 1 row in set (0.00 sec)
> 
> ===
> commit a600d8408ea86782318139c17cf346c8497943d0
> Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02
> 10:38:16
> Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-04
> 12:47:58
> 
> Fixed Resource Leaks, null dereferences, few other issues reported by
> coverity
> 
> -
> framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoI
> mpl.java
> diff --git
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> oImpl.java
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> oImpl.java
> index 3d0c3f5..89d7d27 100644
> ---
> a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> oImpl.java
> +++
> b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDa
> oImpl.java
> @@ -53,15 +53,14 @@
>      @Override
>      public void invalidateRunSession(long id, long runid) {
>          TransactionLegacy txn = TransactionLegacy.currentTxn();
> -        PreparedStatement pstmt = null;
> -        try {
> -            pstmt = txn.prepareAutoCloseStatement("update mshost set runid=0,
> state='Down' where id=? and runid=?");
> -            pstmt.setLong(1, id);
> -            pstmt.setLong(2, runid);
> -
> -            pstmt.executeUpdate();
> +        try (PreparedStatement pstmt = txn.prepareStatement("update mshost
> set runid=0, state='Down' where id=? and runid=?");){
> +            if(pstmt != null) {
> +                pstmt.setLong(1, id);
> +                pstmt.setLong(2, runid);
> +                pstmt.executeUpdate();
> +            }
>          } catch (SQLException e) {
> -            throw new CloudRuntimeException("DB exception on " +
> pstmt.toString(), e);
> +            throw new
> CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(),
> e);
>          }
>      }
> 
> ===
> 2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl]
> (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster session
> detected
> com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session
> detected
>         at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja
> va:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(
> AopUtils.java:317)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin
> point(ReflectiveMethodInvocation.java:183)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
>         at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex
> tInterceptor.java:34)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
>         at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex
> poseInvocationInterceptor.java:91)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
>         at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
>         at com.sun.proxy.$Proxy158.update(Unknown Source)
>         at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
>         at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call
> (DefaultManagedContext.java:56)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
>         at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac
> cess$301(ScheduledThreadPoolExecutor.java:178)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru
> n(ScheduledThreadPoolExecutor.java:293)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
> a:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
>         at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
>         ... 26 more
> 2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster-
> Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat
> java.lang.RuntimeException: update:Exception:Invalid cluster session
> detected
>         at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:155)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja
> va:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(
> AopUtils.java:317)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoin
> point(ReflectiveMethodInvocation.java:183)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:150)
>         at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContex
> tInterceptor.java:34)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:161)
>         at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(Ex
> poseInvocationInterceptor.java:91)
>         at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(R
> eflectiveMethodInvocation.java:172)
>         at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynam
> icAopProxy.java:204)
>         at com.sun.proxy.$Proxy158.update(Unknown Source)
>         at
> com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl
> .java:545)
>         at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(M
> anagedContextRunnable.java:49)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call
> (DefaultManagedContext.java:56)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callW
> ithContext(DefaultManagedContext.java:103)
>         at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWi
> thContext(DefaultManagedContext.java:53)
>         at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(Man
> agedContextRunnable.java:46)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ac
> cess$301(ScheduledThreadPoolExecutor.java:178)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.ru
> n(ScheduledThreadPoolExecutor.java:293)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
> a:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid
> cluster session detected
>         at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
>         ... 26 more
> Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid
> 1404924278923 is no longer valid
>         at
> com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(Management
> ServerHostDaoImpl.java:147)
>         ... 26 more
> 
> 


RE: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Posted by Likitha Shetty <li...@citrix.com>.
Funs, Santhosh fixed this problem yesterday. https://issues.apache.org/jira/browse/CLOUDSTACK-7079 has more details.

Thanks,
Likitha

-----Original Message-----
From: Funs Kessen [mailto:FKessen@schubergphilis.com] 
Sent: Thursday, July 10, 2014 2:10 PM
To: dev@cloudstack.apache.org
Cc: Daan Hoogland
Subject: "Bug" introduced by part of commit a600d8408ea86782318139c17cf346c8497943d0

Hi Devs,

I recently noticed on master that after starting, stopping and starting the management server again I get stacktraces about every second.
After some digging with Daan we found that part of the a600d8408ea86782318139c17cf346c84979943d0 commit causes this. The problem is however deeper rooted, as the code is supposed to update the mshost table, but because another Session ID (actually _runId from ClusterManagerImpl.java from the previous run)  is already in there the update fails. The Session ID is based on time in milliseconds, and offcourse changes when you stop and start the management server again.
Prior to the commit it failed silently, and has done so since the initial checkin it seems. The real question is what the original idea is behind the _runId, and is it something that requires fixing ?

I've pasted part of the commit, sql and stacktrace below.

Cheers,

Funs

===
mysql> select * from mshost;
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
| id | msid           | runid         | name    | state | version        | service_ip | service_port | last_update         | removed | alert_count |
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
|  1 | 90520734207775 | 1404924979461 | cs-mgmt | Up    | 4.5.0-SNAPSHOT | 127.0.0.1  |         9090 | 2014-07-09 16:58:07 | NULL    |           0 |
+----+----------------+---------------+---------+-------+----------------+------------+--------------+---------------------+---------+-------------+
1 row in set (0.00 sec)

===
commit a600d8408ea86782318139c17cf346c8497943d0
Author: Santhosh Edukulla <sa...@gmail.com> 2014-07-02 10:38:16
Committer: Santhosh Edukulla <sa...@gmail.com> 2014-07-04 12:47:58

Fixed Resource Leaks, null dereferences, few other issues reported by coverity

- framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java 
diff --git a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
index 3d0c3f5..89d7d27 100644
--- a/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
+++ b/framework/cluster/src/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java
@@ -53,15 +53,14 @@
     @Override
     public void invalidateRunSession(long id, long runid) {
         TransactionLegacy txn = TransactionLegacy.currentTxn();
-        PreparedStatement pstmt = null;
-        try {
-            pstmt = txn.prepareAutoCloseStatement("update mshost set runid=0, state='Down' where id=? and runid=?");
-            pstmt.setLong(1, id);
-            pstmt.setLong(2, runid);
-
-            pstmt.executeUpdate();
+        try (PreparedStatement pstmt = txn.prepareStatement("update mshost set runid=0, state='Down' where id=? and runid=?");){
+            if(pstmt != null) {
+                pstmt.setLong(1, id);
+                pstmt.setLong(2, runid);
+                pstmt.executeUpdate();
+            }
         } catch (SQLException e) {
-            throw new CloudRuntimeException("DB exception on " + pstmt.toString(), e);
+            throw new CloudRuntimeException("invalidateRunSession:Exception:"+ e.getMessage(), e);
         }
     }

===
2014-07-09 18:45:06,541 WARN  [c.c.c.d.ManagementServerHostDaoImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) update:Exception:Invalid cluster session detected
com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy158.update(Unknown Source)
        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more
2014-07-09 18:45:06,543 ERROR [c.c.c.ClusterManagerImpl] (Cluster-Heartbeat-1:ctx-5f2f8ad5) Unexpected exception in cluster heartbeat
java.lang.RuntimeException: update:Exception:Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy158.update(Unknown Source)
        at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:545)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Invalid cluster session detected
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more
Caused by: com.cloud.cluster.ClusterInvalidSessionException: runid 1404924278923 is no longer valid
        at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:147)
        ... 26 more