Posted to dev@solr.apache.org by Fikavec F <fi...@yandex.ru> on 2023/02/24 14:47:32 UTC

Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

I installed Solr 8.11.1 (SOLR_JAVA_MEM="-Xms31g -Xmx31g") on a RAM disk on a
high-performance server with 10-Gigabit network adapters. Jumbo frames are
enabled (MTU set to 9000), and the Linux kernel TCP buffers are tuned for a
10-Gigabit network (/etc/sysctl.conf):

  * net.ipv4.tcp_rmem = 8192 87380 134217728
  * net.ipv4.tcp_wmem = 8192 65536 134217728
  * net.core.optmem_max = 268435456
  * net.ipv4.tcp_moderate_rcvbuf = 1
  * net.ipv4.tcp_window_scaling = 1
  * net.ipv4.tcp_sack = 0
  * net.ipv4.tcp_timestamps = 0 
  * net.core.netdev_max_backlog = 300000
  * net.core.somaxconn = 8192
  * net.ipv4.tcp_max_syn_backlog = 8192

Network throughput to this server was tested with iPerf and is stable at
9+ Gigabit/s.

2. The client talks to Solr without gzip compression:

  * Set HTTP header - Accept-Encoding: '';

But the data receiving speed for a simple Solr scroll by id with query *:* on
a 250 GB collection (10 shards) never exceeds 200 Megabits/s without Jetty
tuning, or 350 Megabits/s with Jetty tuning. For comparison, 10 GB files served
by the tuned Solr Jetty (such as
/mnt/ramdisk/solr/server/solr-webapp/webapp/testfile.bin) download at 1200+
Megabits/s, while nginx on the same server serves them at 9+ Gigabit/s. So
Jetty is slower than nginx, but why is a Solr scroll 4x slower than its own
Jetty server?

3. In jetty.xml I tried to set a bigger buffer (128 MB):

  * <Set name="outputBufferSize"><Property name="solr.jetty.output.buffer.size" default="134217728" /></Set> - with this and a large search size (512 000), data receiving speeds up to 350 Megabits/s, but the intervals between scrolls remain long, so the overall speed of receiving the full collection stays low.
  * Tuning this option does not speed things up either: <Set name="outputAggregationSize"><Property name="solr.jetty.output.aggregation.size" default="8192" /></Set>, and values >= 128 MB may cause a Solr OOM.

I have no bottleneck in the RAM disk, CPU, or network; the system is only
lightly loaded while scrolling. I also tried putting the data in a single
collection and optimizing it. The speed limit is the same with different
response writers (json, xml, csv, python).

According to a network calculator
(<https://www.switch.ch/network/tools/tcp_throughput/>), 350 Megabits/s is
the theoretical maximum for a TCP buffer size of 8 KB (BW 10000 Mbps;
RTT <= 0.2; TCP buffer size 8 KB).
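
The calculator result can be reproduced approximately with the usual
window-limited estimate: at most one buffer's worth of data is in flight per
round trip, so throughput <= window / RTT. The sketch below is an illustration
only. It assumes the RTT of 0.2 above is in milliseconds and that "8 KB" means
8192 bytes; the class and method names are made up for this example. Under
those assumptions it gives roughly 328 Mbit/s, which is in the same range as
the 350 Megabits figure. It does not show that the FastWriter buffer really
acts as the limiting window; that remains the hypothesis of this post.

```java
/** Window-limited TCP throughput estimate: at most one window per round trip. */
final class TcpWindowEstimate {

    /** Upper bound on throughput in Mbit/s for a window (bytes) and an RTT (seconds). */
    static double maxThroughputMbps(long windowBytes, double rttSeconds) {
        return windowBytes * 8.0 / rttSeconds / 1_000_000.0;
    }

    /** Bytes that must be in flight to keep a link busy (bandwidth-delay product). */
    static double requiredWindowBytes(double linkMbps, double rttSeconds) {
        return linkMbps * 1_000_000.0 * rttSeconds / 8.0;
    }

    public static void main(String[] args) {
        double rtt = 0.0002; // assumption: RTT of 0.2 ms
        System.out.printf("8 KiB window: %.1f Mbit/s%n", maxThroughputMbps(8192, rtt));
        System.out.printf("Window to fill 10 Gbit/s: %.0f bytes%n",
                requiredWindowBytes(10_000, rtt));
    }
}
```

The second method gives the bandwidth-delay product: under the same
assumptions, about 250 000 bytes would have to be in flight to keep a
10 Gbit/s link busy.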

In the source code
(https://github.com/apache/solr/blob/f08f7bb3ef90381078e427f0d164a9f13afe070c/solr/solrj/src/java/org/apache/solr/common/util/FastWriter.java)

I found a fixed, untunable buffer size of 8192 (lines 22-27, 8 KB) and think
that this is the problem on 1Gigabit+ networks:

> /** Single threaded BufferedWriter Internal Solr use only, subject to
> change. */
> public class FastWriter extends Writer {
>   // use default BUFSIZE of BufferedWriter so if we wrap that
>   // it won't cause double buffering.
>   private static final int BUFSIZE = 8192;
>   protected final Writer sink;

Please make this buffer tunable, or tell me how to tune Solr for the fastest
single-threaded retrieval of a full, large collection on 1Gigabit+ networks.
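
As a rough illustration of what "tunable" could mean here, the sketch below
shows a small single-threaded buffered writer whose buffer size comes from a
constructor argument or a system property. It is not Solr's FastWriter and not
a proposed patch: the class name and the property name `example.writer.bufsize`
are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Illustrative buffered writer with a configurable buffer size (not Solr's FastWriter). */
final class TunableBufferedWriter extends Writer {
    // Hypothetical system property name, chosen for this sketch only.
    static final String BUFSIZE_PROPERTY = "example.writer.bufsize";
    static final int DEFAULT_BUFSIZE = 8192;

    private final Writer sink;
    private final char[] buf;
    private int pos;

    TunableBufferedWriter(Writer sink) {
        this(sink, Integer.getInteger(BUFSIZE_PROPERTY, DEFAULT_BUFSIZE));
    }

    TunableBufferedWriter(Writer sink, int bufSize) {
        if (bufSize <= 0) throw new IllegalArgumentException("bufSize must be positive");
        this.sink = sink;
        this.buf = new char[bufSize];
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        if (len >= buf.length) {
            // Writes at least as large as the buffer go straight to the sink
            // once any pending data has been drained.
            flushBuffer();
            sink.write(cbuf, off, len);
            return;
        }
        if (len > buf.length - pos) {
            flushBuffer(); // not enough room left: drain first
        }
        System.arraycopy(cbuf, off, buf, pos, len);
        pos += len;
    }

    private void flushBuffer() throws IOException {
        if (pos > 0) {
            sink.write(buf, 0, pos);
            pos = 0;
        }
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        try (Writer w = new TunableBufferedWriter(out, 64 * 1024)) {
            w.write("{\"response\":{\"numFound\":0}}");
        }
        System.out.println(out);
    }
}
```

With the one-argument constructor, the size could be set at JVM start with
`-Dexample.writer.bufsize=65536`; without the property it falls back to 8192.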

Best Regards,


Re: Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

Posted by Kevin Risden <kr...@apache.org>.
>
> But the data receiving speed for a simple Solr scroll by id with query *:*
> on a 250 GB collection (10 shards) never exceeds 200 Megabits/s without
> Jetty tuning, or 350 Megabits/s with Jetty tuning. For comparison, 10 GB
> files served by the tuned Solr Jetty (such as
> /mnt/ramdisk/solr/server/solr-webapp/webapp/testfile.bin) download at 1200+
> Megabits/s, while nginx on the same server serves them at 9+ Gigabit/s. So
> Jetty is slower than nginx, but why is a Solr scroll 4x slower than its own
> Jetty server?
>

Ignoring the Solr-specific fixed buffer size finding for a second, this
part interests me because of some earlier work I did on Jetty and Apache
Knox (
https://risdenk.github.io/2018/11/13/apache-knox-performance-improvements.html#java---tlsssl-performance).
Do you have TLS enabled by chance? I would expect Jetty without TLS to be
able to send a static file as fast as Nginx, or at least within a small
percentage of it.

From what I gather here, fetching a static file runs at 1.2 Gigabit/s from
Jetty and 9 Gigabit/s from Nginx for the same file with the same client.
That is roughly 7x slower, and I wouldn't expect that at all.

I am not exactly surprised that iterating over documents by id is slower
than serving a static file. 4x slower is potentially questionable, and the
fixed buffer could be an issue. However, there is a lot more work happening
to scroll through documents and create result sets, whereas a static file
can basically be streamed straight from disk/RAM without any modification.
This is why I would first question why the static file is roughly 7x slower
from Jetty than from Nginx.


Kevin Risden


On Fri, Feb 24, 2023 at 11:46 AM Ishan Chattopadhyaya <
ichattopadhyaya@gmail.com> wrote:

> Nice catch! Sounds reasonable to me that it should be configurable.
> Would you like to open a JIRA to submit a patch for this (or would you
> rather someone else pick it up)?

Re: Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
Nice catch! Sounds reasonable to me that it should be configurable.
Would you like to open a JIRA to submit a patch for this (or would you
rather someone else pick it up)?


Re: Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

Posted by Mikhail Khludnev <mk...@apache.org>.
Hello,

Can you rebuild the jar with a bigger buffer and benchmark to confirm the
hypothesis?
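
Before rebuilding the jar, the effect of the buffer size on the number of
downstream writes can be looked at in isolation. The sketch below is only a
rough probe, not a benchmark of Solr: it uses the JDK's `java.io.BufferedWriter`
as a stand-in for FastWriter and a hypothetical in-memory sink that counts how
often it is written to. It says nothing about network throughput, which would
still have to be measured against a rebuilt jar as suggested above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Arrays;

final class BufferSizeProbe {

    /** Sink that discards data but counts how often it is written to. */
    static final class CountingSink extends Writer {
        long writes;
        long chars;

        @Override
        public void write(char[] cbuf, int off, int len) {
            writes++;
            chars += len;
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Pushes totalChars through a BufferedWriter in small chunks; returns the sink write count. */
    static long sinkWrites(int bufSize, int totalChars, int chunkChars) throws IOException {
        CountingSink sink = new CountingSink();
        char[] chunk = new char[chunkChars];
        Arrays.fill(chunk, 'x');
        try (BufferedWriter w = new BufferedWriter(sink, bufSize)) {
            for (int written = 0; written < totalChars; written += chunkChars) {
                w.write(chunk, 0, Math.min(chunkChars, totalChars - written));
            }
        }
        return sink.writes;
    }

    public static void main(String[] args) throws IOException {
        for (int size : new int[] {8192, 65536, 1 << 20}) {
            System.out.println(size + " -> " + sinkWrites(size, 10_000_000, 200) + " sink writes");
        }
    }
}
```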


-- 
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!

Re: Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

Posted by David Smiley <ds...@apache.org>.
You used the word "scroll" a lot.  Can you elaborate?
Search is generally optimized for returning top-X where X is not large.  My
suspicion is that you want lots of results back.  You might want to use
cursorMark as described here:
https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html
On the other hand, if you have only one Solr core, then maybe it doesn't
matter relative to a massive rows param.
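
For reference, a cursorMark walk is driven by a few request parameters: a sort
that includes the uniqueKey field, a rows value, and cursorMark, which starts
at `*` and is then replaced by the nextCursorMark value from each response.
The sketch below only builds the request URL for one page. The host,
collection name and the use of `id` as the uniqueKey are assumptions for
illustration, and the helper class is made up for this example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class CursorMarkUrls {

    /** Builds a /select URL for one page of a cursorMark walk over all documents. */
    static String pageUrl(String solrBase, String collection, String uniqueKey,
                          int rows, String cursorMark) {
        return solrBase + "/" + collection + "/select"
                + "?q=" + enc("*:*")
                + "&sort=" + enc(uniqueKey + " asc") // the sort must include the uniqueKey field
                + "&rows=" + rows
                + "&cursorMark=" + enc(cursorMark);
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First request uses cursorMark=*; later requests pass the nextCursorMark
        // value from the previous response, until it stops changing.
        System.out.println(
                pageUrl("http://localhost:8983/solr", "mycollection", "id", 1000, "*"));
    }
}
```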

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

