1000 VUs causes connect() failed (111: Connection refused) from Nginx #4

@steve-chavez

Problem

Having:

export PGRSTBENCH_SEPARATE_PG="false" # this is just to make the deployment faster
export PGRSTBENCH_EC2_INSTANCE_TYPE="t3a.micro" # needs micro at minimum otherwise at nano the middleware instance becomes unresponsive

$ postgrest-bench-deploy
...

Using 1000 VUs for a k6 load test causes many failures:

$ postgrest-bench-k6 1000 k6/POSTBulk.js 

time="2025-02-10T23:45:17Z" level=warning msg="Request Failed" error="Post \"http://pgrst/employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email\": write tcp 10.0.0.132:36880->10.0.0.250:80: use of closed network connection"
time="2025-02-11T00:04:40Z" level=warning msg="Request Failed" error="Post \"http://pgrst/employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email\": read tcp 10.0.0.132:52914->10.0.0.250:80: read: connection reset by peer"

..

     data_received..................: 4.4 MB 129 kB/s
     data_sent......................: 198 MB 5.7 MB/s
     http_req_blocked...............: avg=43.35ms  min=1.12µs   med=3.7µs    max=11.24s   p(90)=40.87ms  p(95)=92.11ms
     http_req_connecting............: avg=43.19ms  min=0s       med=0s       max=11.24s   p(90)=40.31ms  p(95)=91.21ms
     http_req_duration..............: avg=767.83ms min=0s       med=953.54ms max=4.4s     p(90)=1.24s    p(95)=1.57s  
       { expected_response:true }...: avg=1.02s    min=10.53ms  med=993.78ms max=4.4s     p(90)=1.35s    p(95)=1.72s  
   ✗ http_req_failed................: 25.35% ✓ 18776       ✗ 55283 
     http_req_receiving.............: avg=1.4ms    min=0s       med=40.54µs  max=120.22ms p(90)=403.47µs p(95)=8.3ms  
     http_req_sending...............: avg=2.43ms   min=0s       med=45.23µs  max=452.1ms  p(90)=2.86ms   p(95)=10.87ms
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s     
     http_req_waiting...............: avg=763.99ms min=0s       med=952.48ms max=4.4s     p(90)=1.23s    p(95)=1.57s  
     http_reqs......................: 37030  1077.252069/s
     iteration_duration.............: avg=816.73ms min=812.71µs med=956.25ms max=11.3s    p(90)=1.26s    p(95)=1.63s  
     iterations.....................: 37029  1077.222978/s
     vus............................: 0      min=0         max=1000
     vus_max........................: 1000   min=1000      max=1000
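
The http_req_failed line can be cross-checked by hand. The counts in the summary are consistent with reading ✓ as the number of failed requests and ✗ as the number of successful ones, since http_req_failed is a rate of failures. A small sketch of that arithmetic, where failure_rate is a hypothetical helper and not part of postgrest-bench or k6:

```shell
# Hypothetical helper: recompute the k6 http_req_failed percentage
# from the two counts printed on that line of the summary.
failure_rate() {
  # $1 = count after the check mark (failed), $2 = count after the cross (succeeded)
  awk -v failed="$1" -v ok="$2" 'BEGIN {
    total = failed + ok
    if (total == 0) { print "n/a"; exit }
    printf "%.2f%%\n", 100 * failed / total
  }'
}

failure_rate 18776 55283   # prints 25.35%
```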

No error is logged in journalctl -u postgrest, but the Nginx error log shows many connect() failed (111: Connection refused) errors:

$ postgrest-bench-ssh pgrst
$ cat /var/log/nginx/error.log | less

2025/02/10 23:37:53 [error] 1301#1301: *15884 connect() failed (111: Connection refused) while connecting to upstream, client: 10.0.0.132, server: , request: "POST /employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email HTTP/1.1", upstream: "http://[::1]:3000/employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email", host: "pgrst"
...

Not sure if PostgREST or Nginx is at fault here.
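
One way to narrow this down might be to tally the refused connections per upstream address, since the excerpt above shows Nginx connecting to [::1]:3000. A minimal sketch, assuming the log path from the excerpt and that every error line keeps the same format; count_refused_by_upstream is a hypothetical helper name:

```shell
# Hypothetical helper: count "Connection refused" errors per upstream
# address in an Nginx error log (format assumed to match the excerpt).
count_refused_by_upstream() {
  local log="${1:-/var/log/nginx/error.log}"
  grep -F 'connect() failed (111: Connection refused)' "$log" \
    | grep -oE 'upstream: "http://[^/]+' \
    | sed -E 's|upstream: "http://||' \
    | sort | uniq -c | sort -rn
}
```

Running count_refused_by_upstream on the pgrst instance prints one line per upstream address with its error count, most frequent first. If only one address family shows up, that would at least hint at where to look next.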

Observations

  • At 800 VUs, there's no error:
$ postgrest-bench-k6 800 k6/POSTBulk.js 

Running k6 with 800 vus
Pseudo-terminal will not be allocated because stdin is not a terminal.
Pseudo-terminal will not be allocated because stdin is not a terminal.

     data_received..................: 4.3 MB 134 kB/s
     data_sent......................: 188 MB 5.9 MB/s
     http_req_blocked...............: avg=6.35ms   min=1.08µs  med=2.8µs    max=416.84ms p(90)=4.47µs   p(95)=10.6µs  
     http_req_connecting............: avg=6.27ms   min=0s      med=0s       max=416.79ms p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=896.09ms min=12.59ms med=845.24ms max=12.15s   p(90)=1.48s    p(95)=2.3s    
       { expected_response:true }...: avg=896.09ms min=12.59ms med=845.24ms max=12.15s   p(90)=1.48s    p(95)=2.3s    
   ✓ http_req_failed................: 0.00%  ✓ 0          ✗ 53375
     http_req_receiving.............: avg=70.81µs  min=21.02µs med=45.5µs   max=22.49ms  p(90)=81.92µs  p(95)=109.59µs
     http_req_sending...............: avg=490.92µs min=24.07µs med=49.92µs  max=76.33ms  p(90)=102.82µs p(95)=315.1µs 
     http_req_tls_handshaking.......: avg=0s       min=0s      med=0s       max=0s       p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=895.53ms min=12.47ms med=845.12ms max=12.14s   p(90)=1.48s    p(95)=2.3s    
     http_reqs......................: 26688  836.207664/s
     iteration_duration.............: avg=903.03ms min=13.08ms med=845.99ms max=12.28s   p(90)=1.48s    p(95)=2.34s   
     iterations.....................: 26687  836.176331/s
     vus............................: 0      min=0        max=800
     vus_max........................: 800    min=800      max=800
  • Using a Unix socket instead of TCP lowers the failure rate somewhat (16%, down from the previous 25%):
$ export PGRSTBENCH_WITH_UNIX_SOCKET="false"
$ postgrest-bench-deploy

$ postgrest-bench-k6 1000 k6/POSTBulk.js 

     data_received..................: 4.3 MB 135 kB/s
     data_sent......................: 193 MB 6.0 MB/s
     http_req_blocked...............: avg=25.87ms  min=1.09µs  med=3.05µs  max=11.24s   p(90)=36.15ms  p(95)=78.73ms 
     http_req_connecting............: avg=25.69ms  min=0s      med=0s      max=11.24s   p(90)=35.73ms  p(95)=77.85ms 
     http_req_duration..............: avg=903.86ms min=0s      med=1.02s   max=12.3s    p(90)=1.34s    p(95)=1.76s   
       { expected_response:true }...: avg=1.07s    min=14.32ms med=1.04s   max=12.3s    p(90)=1.46s    p(95)=1.88s   
   ✗ http_req_failed................: 16.22% ✓ 10494       ✗ 54195 
     http_req_receiving.............: avg=998.27µs min=0s      med=42.96µs max=169.44ms p(90)=110.52µs p(95)=655.59µs
     http_req_sending...............: avg=1.8ms    min=0s      med=46.79µs max=278.75ms p(90)=325.06µs p(95)=8.9ms   
     http_req_tls_handshaking.......: avg=0s       min=0s      med=0s      max=0s       p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=901.05ms min=0s      med=1.02s   max=12.29s   p(90)=1.34s    p(95)=1.76s   
     http_reqs......................: 32345  1003.925992/s
     iteration_duration.............: avg=933.73ms min=685.4µs med=1.02s   max=12.31s   p(90)=1.35s    p(95)=1.78s   
     iterations.....................: 32344  1003.894954/s
     vus............................: 0      min=0         max=1000
     vus_max........................: 1000   min=1000      max=1000
  • Raising the VUs to 1200 adds a new Nginx error:

    2025/02/11 00:15:19 [alert] 40240#40240: 1024 worker_connections are not enough
    2025/02/11 00:15:19 [alert] 40240#40240: *36991 1024 worker_connections are not enough while connecting to upstream, client: 10.0.0.132, server: , request: "POST /employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email HTTP/1.1", upstream: "http://127.0.0.1:3000/employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email", host: "pgrst"
    2025/02/11 00:15:30 [error] 40240#40240: *38310 connect() failed (111: Connection refused) while connecting to upstream, client: 10.0.0.132, server: , request: "POST /employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email HTTP/1.1", upstream: "http://[::1]:3000/employee?columns=employee_id,first_name,last_name,title,reports_to,birth_date,hire_date,address,city,state,country,postal_code,phone,fax,email", host: "pgrst"
    
  • Increasing the instance size (e.g. to m5a.4xlarge) reduces the failure rate to 0.
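
The worker_connections alert at 1200 VUs can be put in rough numbers. Per the Nginx documentation, worker_connections counts every connection a worker holds, including connections to proxied servers, so each in-flight proxied request takes about two slots (one client side, one upstream side). A back-of-the-envelope sketch follows; the worker count of 2 is an assumption that is not stated in this issue, and proxy_capacity is a hypothetical helper:

```shell
# Hypothetical helper: rough ceiling on concurrent proxied requests.
# $1 = number of Nginx worker processes (ASSUMED, not taken from this issue)
# $2 = worker_connections (1024 in the alert above)
proxy_capacity() {
  echo $(( $1 * $2 / 2 ))   # two connections per proxied request
}

proxy_capacity 2 1024   # prints 1024
```

Under these assumptions the ceiling sits near 1024 concurrent proxied requests, below 1200 VUs. This estimate only concerns the worker_connections alert; it says nothing about the Connection refused errors.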
