Execute in parallel¶
LLG3D is computationally intensive (3D transient simulations) and usually requires parallel execution.
In addition to the sequential llg3d.solvers.numpy.NumPySolver, LLG3D provides the following parallel solvers:

- llg3d.solvers.opencl.OpenCLSolver: an OpenCL solver that benefits from GPU acceleration
- llg3d.solvers.mpi.MPISolver: an MPI solver that benefits from multiple CPU cores
Specify a parallel solver¶
$ llg3d --N 100 --solver opencl --precision single; rm -f run.npz
Initializing OpenCL solver...
Initializing context...
---
x y z
J 300 21 21
L 2.99000000e-07 2.00000000e-08 2.00000000e-08
d 1.00000000e-09 1.00000000e-09 1.00000000e-09
---
dV = 1.00000000e-27
V = 1.19600000e-22
ntot = 132300
ncell = 119600
---
Failed to read file: /tmp/dep-becbbf.d
/builds/llg3d/llg3d/.venv/lib/python3.11/site-packages/pyopencl/__init__.py:570: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
lambda: self._prg.build(options_bytes, devices),
build program: kernel 'predict' was part of a lengthy uncached source build (assuming cached by ICD) (0.63 s)
element : Cobalt
N = 100
dt = 1e-14
Jx = 300
Jy = 21
Jz = 21
dx = 1e-09
T = 0.0
H_ext = 0.0
init_type : 0
result_file : run.npz
start_averaging = 4000
n_mean = 1
n_profile = 0
solver : opencl
precision : single
blocking = False
seed = 12345
device : auto
profiling = False
verbosity : INFO
np = 1
0%|          | 0/100 [00:00<?, ?it/s]
Failed to read file: /tmp/dep-7da3d7.d
build program: kernel 'rng_gen_philox4x32_normal' was part of a lengthy uncached source build (assuming cached by ICD) (0.26 s)
1%| | 1/100 [00:00<00:26, 3.80it/s]
100%|██████████| 100/100 [00:00<00:00, 342.29it/s]
N iterations = 100
total_time [s] = 0.311
time/ite [s/ite] = 3.111e-03
efficiency [s/ite/pt] = 2.351e-08
CFL = 7.543e-02
Saving run.npz
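The final timing lines are related to one another: time/ite is total_time / N, and efficiency is the per-iteration time divided by the total number of grid points ntot. A minimal sketch of this bookkeeping (the helper name is ours, not part of LLG3D):

```python
def timing_summary(total_time, n_iter, ntot):
    """Per-iteration time [s/ite] and per-point efficiency [s/ite/pt]
    derived from the totals printed at the end of a run."""
    time_per_ite = total_time / n_iter
    efficiency = time_per_ite / ntot
    return time_per_ite, efficiency
```

With the run above, `timing_summary(0.311, 100, 132300)` reproduces (up to rounding of the printed total) the reported 3.111e-03 s/ite and 2.351e-08 s/ite/pt.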
Warning
When using the OpenCL solver in single precision, non-finite values (NaN or Inf) may appear for large grids and long simulations.
To help diagnose such issues, enable error checking by setting the environment variable LLG3D_ENABLE_ERROR_CHECK:
LLG3D_ENABLE_ERROR_CHECK=1 llg3d --solver opencl --precision single --Jx 3000 --N 5000
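One way to check a finished run is to load the result archive and count non-finite entries per array. The sketch below assumes only that run.npz is a standard NumPy .npz archive (the array names inside are not assumed); count_nonfinite is a hypothetical helper, not part of LLG3D:

```python
import numpy as np

def count_nonfinite(path="run.npz"):
    """Return {array_name: number of non-finite entries} for each
    numeric array stored in an .npz result file."""
    report = {}
    with np.load(path) as data:
        for name in data.files:
            arr = data[name]
            if np.issubdtype(arr.dtype, np.number):
                report[name] = int(arr.size - np.isfinite(arr).sum())
    return report
```

A non-zero count for any array indicates the simulation blew up and should be rerun, e.g. in double precision.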
MPI execution¶
$ mpirun -np 6 llg3d --N 100 --solver mpi ; rm -f run.npz
---
x y z
J 300 21 21
L 2.99000000e-07 2.00000000e-08 2.00000000e-08
d 1.00000000e-09 1.00000000e-09 1.00000000e-09
---
dV = 1.00000000e-27
V = 1.19600000e-22
ntot = 132300
ncell = 119600
---
element : Cobalt
N = 100
dt = 1e-14
Jx = 300
Jy = 21
Jz = 21
dx = 1e-09
T = 0.0
H_ext = 0.0
init_type : 0
result_file : run.npz
start_averaging = 4000
n_mean = 1
n_profile = 0
solver : mpi
precision : double
blocking = False
seed = 12345
device : auto
profiling = False
verbosity : INFO
np = 6
0% 0/100 [00:00<?, ?it/s]
15% 15/100 [00:00<00:00, 144.98it/s]
32% 32/100 [00:00<00:00, 155.71it/s]
50% 50/100 [00:00<00:00, 164.71it/s]
67% 67/100 [00:00<00:00, 163.95it/s]
84% 84/100 [00:00<00:00, 165.78it/s]
100% 100/100 [00:00<00:00, 163.81it/s]
N iterations = 100
total_time [s] = 0.666
time/ite [s/ite] = 6.664e-03
efficiency [s/ite/pt] = 5.037e-08
CFL = 7.543e-02
Saving run.npz
Note
If the number of MPI processes np does not evenly divide Jx, the run is interrupted.
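To pick a valid process count before launching, you can list the divisors of Jx. A small sketch (the helper name is ours, not part of LLG3D):

```python
def valid_process_counts(Jx):
    """Process counts that evenly divide the Jx grid points along x."""
    return [n for n in range(1, Jx + 1) if Jx % n == 0]
```

For the default Jx = 300, the smallest valid counts are 1, 2, 3, 4, 5, 6, 10, 12, ..., so `mpirun -np 6` above is valid while `-np 7` would be rejected.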
Warning
Always check the scalability of your simulation when using MPI parallelization (see the scaling documentation for more details).
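A quick way to quantify scalability is to compare wall-clock times at different process counts: speedup is the reference time divided by the parallel time, and parallel efficiency is the speedup divided by the process count. A sketch of these standard metrics (not part of LLG3D):

```python
def scaling_metrics(t_ref, t_n, n_procs):
    """Strong-scaling speedup and parallel efficiency relative to a
    reference (e.g. single-process) timing."""
    speedup = t_ref / t_n
    return speedup, speedup / n_procs
```

An efficiency well below 1 (e.g. a run that is only 4x faster on 6 processes, giving 0.67) signals that adding more processes yields diminishing returns for that problem size.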