Posted to users@kafka.apache.org by Steve Miller <st...@idrathernotsay.com> on 2016/08/01 07:49:37 UTC

Re: Too Many Open Files

Can you run lsof -p (pid) for whatever the pid is for your Kafka process?

For the fd limits you've set, I don't think subtlety is required: if there are a million-ish lines in the output, the fd limit is where you think it is; if it's a lot lower than that, the limit isn't being applied properly somehow (maybe you're running this under, say, supervisord and its config is lowering the limit, or the limits for root are as you say but the limits for the kafka user aren't being set properly, that sort of thing).

If you do have 1M lines in the output, at least this might give you a place to start figuring out what's open and why.
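For example (12345 is just a stand-in for your broker's actual pid):

    lsof -p 12345 | wc -l
    cat /proc/12345/limits | grep 'Max open files'

The first gives you roughly how many fds the process has open; the second shows the limits the running process actually got, which is often not the same as what `ulimit -n` reports in your interactive shell.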

    -Steve

> On Jul 31, 2016, at 4:14 PM, Kessiler Rodrigues <ke...@callinize.com> wrote:
> 
> I’m still experiencing this issue…
> 
> Here are the kafka logs.
> 
> [2016-07-31 20:10:35,658] ERROR Error while accepting connection (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>    at kafka.network.Acceptor.accept(SocketServer.scala:323)
>    at kafka.network.Acceptor.run(SocketServer.scala:268)
>    at java.lang.Thread.run(Thread.java:745)
> [2016-07-31 20:10:35,658] ERROR Error while accepting connection (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>    at kafka.network.Acceptor.accept(SocketServer.scala:323)
>    at kafka.network.Acceptor.run(SocketServer.scala:268)
>    at java.lang.Thread.run(Thread.java:745)
> [2016-07-31 20:10:35,658] ERROR Error while accepting connection (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>    at kafka.network.Acceptor.accept(SocketServer.scala:323)
>    at kafka.network.Acceptor.run(SocketServer.scala:268)
>    at java.lang.Thread.run(Thread.java:745)
> 
> My ulimit is 1 million; how is that possible?
> 
> Can someone help with this? 
> 
> 
>> On Jul 30, 2016, at 5:05 AM, Kessiler Rodrigues <ke...@callinize.com> wrote:
>> 
>> I have changed it a bit.
>> 
>> I have 10 brokers and 20k topics with 1 partition each. 
>> 
>> I looked at kafka’s log dir and I only have 3318 files.
>> 
>> I’m doing some tests to see how many topics/partitions I can have, but it starts throwing “too many open files” once it hits 15k topics.
>> 
>> Any thoughts?
>> 
>> 
>> 
>>> On Jul 29, 2016, at 10:33 PM, Gwen Shapira <gw...@confluent.io> wrote:
>>> 
>>> woah, it looks like you have 15,000 replicas per broker?
>>> 
>>> You can go into the directory you configured for kafka's log.dir and
>>> see how many files you have there. Depending on your segment size and
>>> retention policy, you could have hundreds of files per partition
>>> there...
>>> 
>>> Make sure you have at least that many file handles and then also add
>>> handles for the client connections.
>>> 
>>> 1 million file handles sounds like a lot, but you are running lots of
>>> partitions per broker...
>>> 
>>> We normally don't see more than maybe 4000 per broker and most
>>> clusters have a lot fewer, so consider adding brokers and spreading
>>> partitions around a bit.
>>> 
>>> Gwen
>>> 
>>> On Fri, Jul 29, 2016 at 12:00 PM, Kessiler Rodrigues
>>> <ke...@callinize.com> wrote:
>>>> Hi guys,
>>>> 
>>>> I have been experiencing some issues on kafka, where it’s throwing “too many open files”.
>>>> 
>>>> I have around 6k topics with 5 partitions each.
>>>> 
>>>> My cluster has 6 brokers. All of them are running Ubuntu 16 and the file limit settings are:
>>>> 
>>>> `cat  /proc/sys/fs/file-max`
>>>> 2000000
>>>> 
>>>> `ulimit -n`
>>>> 1000000
>>>> 
>>>> Has anyone experienced this before?
> 
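On Gwen’s point above about counting what’s in log.dir, a quick tally is something like this (the path is only a placeholder for wherever your log.dirs points):

    find /var/lib/kafka/data -type f | wc -l

Each partition holds at least a .log and a .index file per segment (newer brokers add a .timeindex as well), so tens of thousands of partitions adds up quickly even before you count client sockets.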


Re: Too Many Open Files

Posted by Kessiler Rodrigues <ke...@callinize.com>.
Hey guys

I found a solution for this. The kafka process wasn’t picking up the limits config because I was running it under supervisor.

I changed that, and now I’m using systemd to bring kafka up and keep it running!

On systemd services you can set the FD limit with a directive called “LimitNOFILE”.
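For anyone else hitting this, a minimal sketch of the relevant part of a systemd unit (the user and paths here are placeholders, not my exact config):

    [Service]
    User=kafka
    ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
    LimitNOFILE=1000000

After editing the unit, run `systemctl daemon-reload` and restart the service; you can confirm the running broker picked it up with `cat /proc/<pid>/limits`.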

Thanks for all your help!


Re: Too Many Open Files

Posted by Anirudh P <pa...@gmail.com>.
I agree with Steve. We had a similar problem where we set the ulimit to a
certain value but it was getting overridden.
It only worked when we set the ulimit after logging in as root. You might
want to give that a try if you have not done so already.
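
For reference, the usual way to make a per-user nofile limit stick is an entry in /etc/security/limits.conf (or a drop-in under /etc/security/limits.d/). A sketch, assuming the broker runs as a dedicated kafka user; the user name and values are illustrative:

    kafka  soft  nofile  1000000
    kafka  hard  nofile  1000000

Keep in mind pam_limits applies these at login, so a broker started by an init system or a supervisor may need the limit configured in that system instead.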

- Anirudh
