Since updating from 6.x to 7.7.0 (Debian testing) we are getting a spew of 503s that starts days after the cache has filled up.
/usr/sbin/varnishd -j unix,user=vcache -F -a 0.0.0.0:80 -T localhost:6082 -f /etc/varnish/default.vcl -l 81920k -p thread_pool_min=80 -p thread_pool_max=4000 -p thread_pools=2 -p default_ttl=1209600 -p nuke_limit=1000 -p max_restarts=4 -S /etc/varnish/secret -s default=malloc,8569M -s mm=malloc,17139M
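For context on how big objects end up in the mm store: assuming they are steered there from VCL via beresp.storage, a minimal size-based selection would look roughly like this (the threshold and condition are illustrative, not the literal default.vcl):

import std;

sub vcl_backend_response {
    # Illustrative only: send large bodies to the "mm" malloc store,
    # leave everything else in the "default" store.
    if (std.integer(beresp.http.Content-Length, 0) > 104857600) {
        set beresp.storage = storage.mm;
    }
}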
The logs look like this:
* << Request >> 583226464
- Begin req 583226327 rxreq
- Timestamp Start: 1761649198.288374 0.000000 0.000000
- Timestamp Req: 1761649198.288374 0.000000 0.000000
- VCL_use boot
- ReqStart 10.0.0.1 64710 a0
- ReqMethod GET
- ReqURL /video/abcd.mp4
- ReqProtocol HTTP/1.1
- ReqHeader host: example.com
- ReqHeader sec-ch-ua-platform: ...
- ReqHeader accept-encoding: ...
- ReqHeader user-agent: ...
- ReqHeader accept: */*
- ReqHeader range: bytes=44325-15139796
- ReqHeader if-range: "063fccff745ae4b743f8053ba65c0f7d-2"
- ReqHeader priority: i
- ReqHeader x-forwarded-for: 1.2.3.4
- ReqUnset x-forwarded-for: 1.2.3.4
- ReqHeader X-Forwarded-For: 1.2.3.4, 10.0.1.1
- ReqHeader Via: 1.1 cache3 (Varnish/7.7)
- VCL_call RECV
- ReqUnset accept-encoding: identity;q=1, *;q=0
- ReqURL /video/abcd.mp4
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call MISS
- VCL_return fetch
- Link bereq 583226465 fetch
- Timestamp Fetch: 1761649198.310064 0.021689 0.021689
- RespProtocol HTTP/1.1
- RespStatus 503
- RespReason Service Unavailable
- RespHeader Date: Tue, 28 Oct 2025 10:59:58 GMT
- RespHeader Server: Varnish
- RespHeader X-Varnish: 583226464
- VCL_call SYNTH
- RespHeader Content-Type: text/html; charset=utf-8
- RespHeader Retry-After: 5
- VCL_return deliver
- Timestamp Process: 1761649198.310122 0.021748 0.000058
- RespHeader Content-Length: 283
- Storage malloc Transient
- Filters
- RespHeader Connection: keep-alive
- Timestamp Resp: 1761649198.310186 0.021811 0.000063
- ReqAcct 711 0 711 213 283 496
- End
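A query along these lines should isolate just the failing transactions (a sketch, grouping the backend fetch with its client request):

varnishlog -g request -q 'RespStatus == 503'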
When requests started failing, I noticed both a rise in allocation failures in the mm pool (which handles the big objects) and a drop in LRU nukes.
So far this only happens for big objects, and only after a few days of running once the pool has been full (I haven't run it long enough to see it happen on the other pool).
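A minimal sketch of watching the relevant counters, assuming the stock varnishstat fields (SMA.<store>.c_fail for allocation failures, MAIN.n_lru_nuked and MAIN.n_lru_limited for nuking activity):

# one-shot dump of allocation and LRU counters for both malloc stores
varnishstat -1 -f 'SMA.mm.*' -f 'SMA.default.*' -f MAIN.n_lru_nuked -f MAIN.n_lru_limited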
Steps to Reproduce (for bugs)
No idea; we run it as a cache in front of S3 for files ranging from small CSS to video.
Varnish Cache version
7.7.0 (Debian testing)
Operating system
No response