Labels
help wanted, invalid, question
Description
I am trying to run a simple script on distributed GPUs, and I get an (extremely verbose) MPI error when calling `ocean_simulation(grid)` for a `TripolarGrid()` with `z=(-6000,0)` or `z=ExponentialCoordinate(Nz, -6000, 0)`. However, I don't get this error with `z=(-1,0)` or `z=(-10,0)`.
See below for the MWE:
using MPI
using CUDA
MPI.Init()
atexit(MPI.Finalize)
using Oceananigans
using Oceananigans.Units
using ClimaOcean
using Oceananigans.DistributedComputations
using Printf
data_path = expanduser("/g/data/v46/txs156/ocean-ensembles/data/")
arch = Distributed(GPU(); partition = Partition(y = DistributedComputations.Equal()), synchronized_communication=true)
Nx, Ny, Nz = 100, 100, 50
Lx, Ly = 100, 100
@info "Defining vertical z faces"
depth = -6000.0 # Depth of the ocean in meters
z_faces = ExponentialCoordinate(Nz, depth, 0) # <----- This doesn't work w/ mpirun -n 2 julia --project distributed_GPU.jl
z_faces = (-6000,0) # <----- This also doesn't work w/ mpirun -n 2 julia --project distributed_GPU.jl
z_faces = (-10,0) # <----- This works w/ mpirun -n 2 julia --project distributed_GPU.jl
@info "Creating grid"
grid = TripolarGrid(arch;
                    size = (Nx, Ny, Nz),
                    z = z_faces,
                    halo = (6, 6, 3))
@info "Creating model"
ocean = ocean_simulation(grid)