Open
Description
I've noticed this failure show up a few times with the online tests:
Num workers (10): Error During Test at /home/runner/work/AWSClusterManagers.jl/AWSClusterManagers.jl/test/batch_online.jl:122
Got exception outside of a @test
CloudWatch logs have not completed ingestion within 1 minute
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] run_batch_job(::String, ::Int64; timeout::Minute, should_fail::Bool) at /home/runner/work/AWSClusterManagers.jl/AWSClusterManagers.jl/test/batch_online.jl:109
The logs for the manager:
Manager accepting worker connections via: 10.0.12.53:32768
--
Found previously registered job definition: "arn:aws:batch:us-east-1:134847318362:job-definition/AWSClusterManagers-jl:18"
Submitted array job "AWSClusterManagers-jl-n10" (76e81a62-65ae-488a-a6b9-1bbc8af886cf, n=10)
Spawning array job: 76e81a62-65ae-488a-a6b9-1bbc8af886cf (n=10)
NumProcs: 11
Worker container 2: c6205854a2d2ed54efe54d38572d0ce3848262ea8475f1b9a8cef1ebf92927c8
Worker job 2: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:7
Worker container 3: 9a3945505cd55d1a54818fd5184b08a96f082dc89fcdb739cfe1f1de840cb0b4
Worker job 3: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:0
Worker container 4: 04dd75c4e042186eb2e31c6eda96a2ae24279ea260a1bc5eafabfa8bc9616cb6
Worker job 4: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:2
Worker container 5: 19253f8ed3bd9388773750efcdc9287f5ea67468800e02844ecf4d8ba1c72c7b
Worker job 5: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:1
Worker container 6: c9e538218ac792ddfe549abbf6e08ae4decc5f6f1d67d8441f3af62758a4770b
Worker job 6: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:6
Worker container 7: a2c02e7da471ce44e7d16ff3db89613482a810860ac6f28764c3c2bd89890613
Worker job 7: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:8
Worker container 8: 65e7248374413e35edb107f29c0892f211631ed94fafd4745b22b50d7522cb52
Worker job 8: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:5
Worker container 9: c0d5f47a071e852939bbb95d0f6beaf19dd4b896caaa4208be6d502a48bf32c1
Worker job 9: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:9
Worker container 10: 4346c26d27e19f0f751bfbafc1cdc5700920416fdb675ae529c13133f7563edb
Worker job 10: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:3
Worker container 11: 66244fbcc7b297cbc9b92cd60e536591d119bed5a12afa8cd95efe798af4da19
Worker job 11: 76e81a62-65ae-488a-a6b9-1bbc8af886cf:4
Manager Complete
┌ Warning: Forcibly interrupting busy workers
│ exception = rmprocs: pids [3, 4, 7, 8, 9, 10, 11] not terminated after 5.0 seconds.
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/cluster.jl:1234
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/cluster.jl:1030