You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Mikhail Mazursky (JIRA)" <ji...@apache.org> on 2013/10/31 12:15:18 UTC

[jira] [Created] (CASSANDRA-6275) Paxos leaks file handles

Mikhail Mazursky created CASSANDRA-6275:
-------------------------------------------

             Summary: Paxos leaks file handles
                 Key: CASSANDRA-6275
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Mikhail Mazursky
            Priority: Blocker


Looks like C* is leaking file descriptors when doing lots of CAS operations.

{noformat}
$ sudo cat /proc/15455/limits
Limit                     Soft Limit           Hard Limit           Units    
Max cpu time              unlimited            unlimited            seconds  
Max file size             unlimited            unlimited            bytes    
Max data size             unlimited            unlimited            bytes    
Max stack size            10485760             unlimited            bytes    
Max core file size        0                    0                    bytes    
Max resident set          unlimited            unlimited            bytes    
Max processes             1024                 unlimited            processes
Max open files            4096                 4096                 files    
Max locked memory         unlimited            unlimited            bytes    
Max address space         unlimited            unlimited            bytes    
Max file locks            unlimited            unlimited            locks    
Max pending signals       14633                14633                signals  
Max msgqueue size         819200               819200               bytes    
Max nice priority         0                    0                   
Max realtime priority     0                    0                   
Max realtime timeout      unlimited            unlimited            us 
{noformat}

Looks like the problem is not in limits.

Before load test:
{noformat}
cassandra-test0 ~]$ lsof -n | grep java | wc -l
166

cassandra-test1 ~]$ lsof -n | grep java | wc -l
164

cassandra-test2 ~]$ lsof -n | grep java | wc -l
180
{noformat}

After load test:
{noformat}
cassandra-test0 ~]$ lsof -n | grep java | wc -l
967

cassandra-test1 ~]$ lsof -n | grep java | wc -l
1766

cassandra-test2 ~]$ lsof -n | grep java | wc -l
2578
{noformat}

Most opened files have names like:
{noformat}
java      16890 cassandra 1636r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1637r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1638r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1639r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1640r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1641r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1642r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1643r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1644r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1645r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1646r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1647r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1648r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1649r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1650r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1651r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1652r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1653r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1654r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java      16890 cassandra 1655r      REG             202,17 161158485     655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java      16890 cassandra 1656r      REG             202,17  88724987     655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
{noformat}

Also, when that happens it's not always possible to shutdown server process via SIGTERM. Have to use SIGKILL.

p.s. See mailing thread for more context information https://www.mail-archive.com/user@cassandra.apache.org/msg33035.html



--
This message was sent by Atlassian JIRA
(v6.1#6144)