Posted to users@trafficserver.apache.org by Alan Carroll <so...@verizonmedia.com> on 2020/05/13 23:51:15 UTC

Re: cache read and write failures

Cache misses count as read failures, so you should expect a lot of those on
an empty cache. Write failures can be caused by clients giving up on the
transaction, so I'd take those numbers with a bit of caution. If actual
system-level read/write failures happen more than 4 or 5 times in a row,
the disk is taken offline.
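
To make that concrete, here's a rough sketch (plain Python, counters copied
from the volume_0 dump quoted below) of how I'd read those numbers: compare
the read/write failure shares against the span error counters, which as far
as I know are the ones that track actual device-level I/O errors.

def failure_share(success, failure):
    """Failures as a percentage of all attempts."""
    total = success + failure
    return 100.0 * failure / total if total else 0.0

# Counters taken from the stats dump further down in this thread.
stats = {
    "read.success": 525727,   "read.failure": 1290873,
    "write.success": 222118,  "write.failure": 42813,
    "span.errors.read": 0,    "span.errors.write": 0,  # actual disk I/O errors
}

print("read 'failures' (mostly misses): %.1f%%"
      % failure_share(stats["read.success"], stats["read.failure"]))
print("write failures (often aborted clients): %.1f%%"
      % failure_share(stats["write.success"], stats["write.failure"]))
print("span errors (read/write): %d/%d"
      % (stats["span.errors.read"], stats["span.errors.write"]))

With your numbers that's roughly a 70% read "failure" rate against a cache
that is only 3% full, and zero span errors, which is consistent with misses
rather than a bad disk.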

On Tue, Apr 28, 2020 at 11:35 AM edd! <me...@eddi.me> wrote:

> Interesting figures, especially since the failures seem to persist even
> across different major releases.
> I initially thought the failures existed because I was testing ATS in a VM.
> So, yesterday I installed the same OS and ATS release, compiled the source
> as provided, on a bare-metal HP server with 2 x 240 GB SSDs in RAID1 for the
> OS and 2 x 1 TB drives, each in its own RAID0 volume, for the cache
> storage... However, I still see failures (see below).
>
> "proxy.process.version.server.short": "8.0.7",
> "proxy.process.version.server.long": "Apache Traffic Server - traffic_server - 8.0.7 - (build # 042720 on Apr 27 2020 at 20:11:28)",
> "proxy.process.version.server.build_number": "042720",
> "proxy.process.version.server.build_time": "20:11:28",
> "proxy.process.version.server.build_date": "Apr 27 2020",
> "proxy.process.version.server.build_person": "root",
> "proxy.process.http.background_fill_current_count": "4",
> "proxy.process.http.current_client_connections": "640",
> "proxy.process.http.current_active_client_connections": "122",
> "proxy.process.http.websocket.current_active_client_connections": "0",
> "proxy.process.http.current_client_transactions": "261",
> "proxy.process.http.current_server_transactions": "184",
> "proxy.process.http.current_parent_proxy_connections": "0",
> "proxy.process.http.current_server_connections": "805",
> "proxy.process.http.current_cache_connections": "155",
>
> "proxy.process.cache.volume_0.bytes_used": "65715343360",
> "proxy.process.cache.volume_0.bytes_total": "1917932191744",
> "proxy.process.cache.volume_0.ram_cache.total_bytes": "25769803776",
> "proxy.process.cache.volume_0.ram_cache.bytes_used": "770314880",
> "proxy.process.cache.volume_0.ram_cache.hits": "541749",
> "proxy.process.cache.volume_0.ram_cache.misses": "115283",
> "proxy.process.cache.volume_0.pread_count": "0",
> "proxy.process.cache.volume_0.percent_full": "3",
> "proxy.process.cache.volume_0.lookup.active": "0",
> "proxy.process.cache.volume_0.lookup.success": "0",
> "proxy.process.cache.volume_0.lookup.failure": "0",
> "proxy.process.cache.volume_0.read.active": "4",
> "proxy.process.cache.volume_0.read.success": "525727","proxy.process.cache.volume_0.read.failure": "1290873",
> "proxy.process.cache.volume_0.write.active": "153",
> "proxy.process.cache.volume_0.write.success": "222118","proxy.process.cache.volume_0.write.failure": "42813",
> "proxy.process.cache.volume_0.write.backlog.failure": "0",
> "proxy.process.cache.volume_0.update.active": "1",
> "proxy.process.cache.volume_0.update.success": "33634","proxy.process.cache.volume_0.update.failure": "533",
> "proxy.process.cache.volume_0.remove.active": "0",
> "proxy.process.cache.volume_0.remove.success": "0",
> "proxy.process.cache.volume_0.remove.failure": "0",
> "proxy.process.cache.volume_0.evacuate.active": "0",
> "proxy.process.cache.volume_0.evacuate.success": "0",
> "proxy.process.cache.volume_0.evacuate.failure": "0",
> "proxy.process.cache.volume_0.scan.active": "0",
> "proxy.process.cache.volume_0.scan.success": "0",
> "proxy.process.cache.volume_0.scan.failure": "0",
> "proxy.process.cache.volume_0.direntries.total": "239453928",
> "proxy.process.cache.volume_0.direntries.used": "274305",
> "proxy.process.cache.volume_0.directory_collision": "0",
> "proxy.process.cache.volume_0.frags_per_doc.1": "228249",
> "proxy.process.cache.volume_0.frags_per_doc.2": "0",
> "proxy.process.cache.volume_0.frags_per_doc.3+": "10884",
> "proxy.process.cache.volume_0.read_busy.success": "105","proxy.process.cache.volume_0.read_busy.failure": "52600",
> "proxy.process.cache.volume_0.write_bytes_stat": "0",
> "proxy.process.cache.volume_0.vector_marshals": "0",
> "proxy.process.cache.volume_0.hdr_marshals": "0",
> "proxy.process.cache.volume_0.hdr_marshal_bytes": "0",
> "proxy.process.cache.volume_0.gc_bytes_evacuated": "0",
> "proxy.process.cache.volume_0.gc_frags_evacuated": "0",
> "proxy.process.cache.volume_0.wrap_count": "0",
> "proxy.process.cache.volume_0.sync.count": "252",
> "proxy.process.cache.volume_0.sync.bytes": "302145806336",
> "proxy.process.cache.volume_0.sync.time": "72939367496484",
> "proxy.process.cache.volume_0.span.errors.read": "0",
> "proxy.process.cache.volume_0.span.errors.write": "0",
> "proxy.process.cache.volume_0.span.failing": "0",
> "proxy.process.cache.volume_0.span.offline": "0",
> "proxy.process.cache.volume_0.span.online": "0",
> "server": "8.0.7"
>
>
> On Tue, Apr 28, 2020 at 6:58 PM Bryan Call <bc...@apache.org> wrote:
>
>> Here are some numbers from our production servers, from a couple of
>> different groups. We are running a heavily modified version of 7.1.2.
>>
>> One of our production servers for our CDN:
>> proxy.process.cache.read.success 6037258
>> proxy.process.cache.read.failure 13845799
>>
>> Here is another one from another group:
>> proxy.process.cache.read.success 5575072
>> proxy.process.cache.read.failure 26784750
>>
>> I talked to another company and they were seeing about a 3% failure rate
>> on their cache, and they are using 8.0.7. I created an issue for this:
>> https://github.com/apache/trafficserver/issues/6713
>>
>> -Bryan
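
The same failure-share arithmetic applied to the counters above, as a quick
sketch (same caveat as before: most of these "failures" are cache misses,
not I/O errors):

# Failure share for the two production servers quoted above.
for name, success, failure in [("CDN server", 6037258, 13845799),
                               ("other group", 5575072, 26784750)]:
    share = failure / (success + failure)
    print("%s: %.1f%% of cache reads reported as failures" % (name, 100 * share))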
>>
>> > On Apr 26, 2020, at 4:32 PM, edd! <me...@eddi.me> wrote:
>> >
>> > Hi,
>> >
>> > New to ATS, I compiled 8.0.7 from source yesterday on CentOS 7
>> > # source /opt/rh/devtoolset-7/enable
>> > # ./configure --enable-experimental-plugins
>> > # make && make install
>> > Testing it as a transparent forward proxy serving ~500 users with HTTP
>> > caching enabled. I first tried raw cache storage and then a volume file,
>> > but in both cases I got many read failures and a few write failures on a
>> > 100 GB SSD partition.
>> >
>> > "proxy.process.cache.volume_0.bytes_used": "3323351040",
>> > "proxy.process.cache.volume_0.bytes_total": "106167836672",
>> > "proxy.process.cache.volume_0.ram_cache.total_bytes": "12884901888",
>> > "proxy.process.cache.volume_0.ram_cache.bytes_used": "6062080",
>> > "proxy.process.cache.volume_0.ram_cache.hits": "4916",
>> > "proxy.process.cache.volume_0.ram_cache.misses": "1411",
>> > "proxy.process.cache.volume_0.pread_count": "0",
>> > "proxy.process.cache.volume_0.percent_full": "3",
>> > "proxy.process.cache.volume_0.lookup.active": "0",
>> > "proxy.process.cache.volume_0.lookup.success": "0",
>> > "proxy.process.cache.volume_0.lookup.failure": "0",
>> > "proxy.process.cache.volume_0.read.active": "1",
>> > "proxy.process.cache.volume_0.read.success": "5566",
>> > "proxy.process.cache.volume_0.read.failure": "22084",
>> > "proxy.process.cache.volume_0.write.active": "8",
>> > "proxy.process.cache.volume_0.write.success": "5918",
>> > "proxy.process.cache.volume_0.write.failure": "568",
>> > "proxy.process.cache.volume_0.write.backlog.failure": "272",
>> > "proxy.process.cache.volume_0.update.active": "1",
>> > "proxy.process.cache.volume_0.update.success": "306",
>> > "proxy.process.cache.volume_0.update.failure": "4",
>> > "proxy.process.cache.volume_0.remove.active": "0",
>> > "proxy.process.cache.volume_0.remove.success": "0",
>> > "proxy.process.cache.volume_0.remove.failure": "0",
>> > "proxy.process.cache.volume_0.evacuate.active": "0",
>> > "proxy.process.cache.volume_0.evacuate.success": "0",
>> > "proxy.process.cache.volume_0.evacuate.failure": "0",
>> > "proxy.process.cache.volume_0.scan.active": "0",
>> > "proxy.process.cache.volume_0.scan.success": "0",
>> > "proxy.process.cache.volume_0.scan.failure": "0",
>> > "proxy.process.cache.volume_0.direntries.total": "13255088",
>> > "proxy.process.cache.volume_0.direntries.used": "9230",
>> > "proxy.process.cache.volume_0.directory_collision": "0",
>> > "proxy.process.cache.volume_0.frags_per_doc.1": "5372",
>> > "proxy.process.cache.volume_0.frags_per_doc.2": "0",
>> > "proxy.process.cache.volume_0.frags_per_doc.3+": "625",
>> > "proxy.process.cache.volume_0.read_busy.success": "4",
>> > "proxy.process.cache.volume_0.read_busy.failure": "351",
>> >
>> > Disk read/write tests:
>> > hdparm -t /dev/sdb1
>> > /dev/sdb1:
>> >  Timing buffered disk reads: 196 MB in  3.24 seconds =  60.54 MB/sec
>> >
>> > hdparm -T /dev/sdb1
>> > /dev/sdb1:
>> >  Timing cached reads:   11662 MB in  1.99 seconds = 5863.27 MB/sec
>> >
>> > dd if=/dev/zero of=/cache/test1.img bs=1G count=1 oflag=dsync
>> > 1+0 records in
>> > 1+0 records out
>> > 1073741824 bytes (1.1 GB) copied, 67.6976 s, 15.9 MB/s
>> >
>> > dd if=/dev/zero of=/cache/test2.img bs=512 count=1000 oflag=dsync
>> > 1000+0 records in
>> > 1000+0 records out
>> > 512000 bytes (512 kB) copied, 0.374173 s, 1.4 MB/s
>> >
>> > Please help,
>> >
>> > Thank you,
>> > Eddi
>>
>>

Re: cache read and write failures

Posted by edd! <me...@eddi.me>.
I agree

Thank you guys

On Thu, 14 May 2020, 03:29 Leif Hedstrom, <zw...@apache.org> wrote:

> Also, all this was discussed on the issue that was opened :). I think we
> should close it; these metrics work exactly as expected, albeit in a
> somewhat confusing way.
>
> — Leif

Re: cache read and write failures

Posted by Leif Hedstrom <zw...@apache.org>.
Also, all this was discussed on the issue that was opened :). I think we should close it; these metrics work exactly as expected, albeit in a somewhat confusing way.

— Leif 
