Posted to dev@cassandra.apache.org by Florian Dambrine <fl...@gumgum.com> on 2014/06/04 19:15:02 UTC
Too Many Open Files (sockets) - VNodes - Map/Reduce Job
Hi everybody,
We are running Amazon Elastic MapReduce jobs against a 25-node Cassandra
cluster (with vnodes). Since we increased the size of the cluster, we have
been hitting a "too many open files" (socket) exception when the job
creates its input splits. Does anyone have an idea?
Thanks,
Here is the stacktrace:
14/06/04 03:23:24 INFO mapred.JobClient: Default number of map tasks: null
14/06/04 03:23:24 INFO mapred.JobClient: Setting default number of map
tasks based on cluster size to : 80
14/06/04 03:23:24 INFO mapred.JobClient: Default number of reduce tasks: 26
14/06/04 03:23:25 INFO security.ShellBasedUnixGroupsMapping: add
hadoop to shell userGroupsCache
14/06/04 03:23:25 INFO mapred.JobClient: Setting group to hadoop
14/06/04 03:23:41 ERROR transport.TSocket: Could not configure socket.
java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:447)
at java.net.Socket.getImpl(Socket.java:510)
at java.net.Socket.setSoLinger(Socket.java:984)
at org.apache.thrift.transport.TSocket.initSocket(TSocket.java:118)
at org.apache.thrift.transport.TSocket.<init>(TSocket.java:109)
at org.apache.thrift.transport.TSocket.<init>(TSocket.java:94)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:39)
at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:558)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:286)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
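For reference, a quick way to see how close a process is to its file-descriptor limit on Linux (the `<pid>` placeholder stands for the JobClient JVM's pid and is not something from the log above):

```shell
# Per-process open-file limit for the current shell (and its children)
ulimit -n

# Count file descriptors currently held by this shell itself
ls /proc/self/fd | wc -l

# For another process, substitute its pid:
#   ls /proc/<pid>/fd | wc -l
#   lsof -p <pid> | grep -c TCP     # count open TCP sockets only
```

With vnodes, split generation opens a connection per token range, so the count climbs quickly as the cluster grows.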
--
*Florian Dambrine* | Intern, Big Data
*GumGum* <http://www.gumgum.com/> | *Ads that stick*
209-797-3994 | florian@gumgum.com <va...@gumgum.com>
Re: Too Many Open Files (sockets) - VNodes - Map/Reduce Job
Posted by Michael Shuler <mi...@pbandjelly.org>.
(this is probably a better question for the user list - cc/reply-to set)
Allow more files to be open :)
http://www.datastax.com/documentation/cassandra/1.2/cassandra/install/installRecommendSettings.html
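On the machine running the job client, that might look like the following sketch (assuming a Linux box where the job runs as the `hadoop` user shown in the log; 100000 is an illustrative value, check the page above for the recommended numbers):

```shell
# /etc/security/limits.conf — raise the open-file limit for the
# user that launches the Hadoop job:
#
#   hadoop soft nofile 100000
#   hadoop hard nofile 100000
#
# Log out and back in for the new limit to take effect, then verify:
ulimit -n
```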
--
Kind regards,
Michael
On 06/04/2014 12:15 PM, Florian Dambrine wrote:
> Hi everybody,
>
> We are running Amazon Elastic MapReduce jobs against a 25-node Cassandra
> cluster (with vnodes). Since we increased the size of the cluster, we have
> been hitting a "too many open files" (socket) exception when the job
> creates its input splits. Does anyone have an idea?
>
> Thanks,
>
> Here is the stacktrace:
>
>
> 14/06/04 03:23:24 INFO mapred.JobClient: Default number of map tasks: null
> 14/06/04 03:23:24 INFO mapred.JobClient: Setting default number of map
> tasks based on cluster size to : 80
> 14/06/04 03:23:24 INFO mapred.JobClient: Default number of reduce tasks: 26
> 14/06/04 03:23:25 INFO security.ShellBasedUnixGroupsMapping: add
> hadoop to shell userGroupsCache
> 14/06/04 03:23:25 INFO mapred.JobClient: Setting group to hadoop
> 14/06/04 03:23:41 ERROR transport.TSocket: Could not configure socket.
> java.net.SocketException: Too many open files
> at java.net.Socket.createImpl(Socket.java:447)
> at java.net.Socket.getImpl(Socket.java:510)
> at java.net.Socket.setSoLinger(Socket.java:984)
> at org.apache.thrift.transport.TSocket.initSocket(TSocket.java:118)
> at org.apache.thrift.transport.TSocket.<init>(TSocket.java:109)
> at org.apache.thrift.transport.TSocket.<init>(TSocket.java:94)
> at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:39)
> at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:558)
> at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:286)
> at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
> at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
> at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>
>
>