Execute in parallel

LLG3D is computationally intensive (3D transient simulations) and usually requires parallel execution.

In addition to the sequential llg3d.solvers.numpy.NumPySolver, LLG3D supports two parallel strategies: an OpenCL solver (--solver opencl) for GPU or multi-core CPU execution, and an MPI solver (--solver mpi) for distributed-memory execution.

Specify a parallel solver

$ llg3d --N 100 --solver opencl --precision single; rm -f run.npz
Initializing OpenCL solver...
Initializing context...
---
	x		y		z
J	300		21		21
L	2.99000000e-07	2.00000000e-08	2.00000000e-08
d	1.00000000e-09	1.00000000e-09	1.00000000e-09
---
dV    = 1.00000000e-27
V     = 1.19600000e-22
ntot  = 132300
ncell = 119600
---
Failed to read file: /tmp/dep-becbbf.d
/builds/llg3d/llg3d/.venv/lib/python3.11/site-packages/pyopencl/__init__.py:570: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  lambda: self._prg.build(options_bytes, devices),
build program: kernel 'predict' was part of a lengthy uncached source build (assuming cached by ICD) (0.63 s)
element         : Cobalt
N               = 100
dt              = 1e-14
Jx              = 300
Jy              = 21
Jz              = 21
dx              = 1e-09
T               = 0.0
H_ext           = 0.0
init_type       : 0
result_file     : run.npz
start_averaging = 4000
n_mean          = 1
n_profile       = 0
solver          : opencl
precision       : single
blocking        = False
seed            = 12345
device          : auto
profiling       = False
verbosity       : INFO
np              = 1


  0%|          | 0/100 [00:00<?, ?it/s]
Failed to read file: /tmp/dep-7da3d7.d
build program: kernel 'rng_gen_philox4x32_normal' was part of a lengthy uncached source build (assuming cached by ICD) (0.26 s)

  1%|          | 1/100 [00:00<00:26,  3.80it/s]
100%|██████████| 100/100 [00:00<00:00, 342.29it/s]
N iterations          = 100
total_time [s]        = 0.311
time/ite [s/ite]      = 3.111e-03
efficiency [s/ite/pt] = 2.351e-08
CFL                   = 7.543e-02
Saving run.npz
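The summary metrics printed at the end of a run are related in a simple way: time/ite is total_time divided by the number of iterations N, and efficiency divides that again by the total number of grid points ntot = Jx × Jy × Jz. A minimal sketch reproducing the figures from the run above (up to rounding of the printed total_time):

```python
# Relationship between the performance metrics printed by llg3d.
Jx, Jy, Jz = 300, 21, 21   # grid dimensions (J row of the header table)
N = 100                    # number of iterations
total_time = 0.311         # reported wall-clock time [s]

ntot = Jx * Jy * Jz               # total number of grid points
time_per_ite = total_time / N     # [s/ite]
efficiency = time_per_ite / ntot  # [s/ite/pt]

print(f"ntot                  = {ntot}")              # 132300
print(f"time/ite [s/ite]      = {time_per_ite:.3e}")
print(f"efficiency [s/ite/pt] = {efficiency:.3e}")    # 2.351e-08
```

The efficiency metric normalizes by problem size, which makes it the natural quantity to compare across solvers: here the MPI run below achieves 5.037e-08 s/ite/pt on 6 processes versus 2.351e-08 s/ite/pt for the single-precision OpenCL run.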

Warning

When using the OpenCL solver, non-finite values (NaN or Inf) may appear for large grids and long simulations in single precision. To help diagnose these issues, enable error checking by setting the environment variable LLG3D_ENABLE_ERROR_CHECK:

LLG3D_ENABLE_ERROR_CHECK=1 llg3d --solver opencl --precision single --Jx 3000 --N 5000 

MPI execution

$ mpirun -np 6 llg3d --N 100 --solver mpi ; rm -f run.npz
---
	x		y		z
J	300		21		21
L	2.99000000e-07	2.00000000e-08	2.00000000e-08
d	1.00000000e-09	1.00000000e-09	1.00000000e-09
---
dV    = 1.00000000e-27
V     = 1.19600000e-22
ntot  = 132300
ncell = 119600
---
element         : Cobalt
N               = 100
dt              = 1e-14
Jx              = 300
Jy              = 21
Jz              = 21
dx              = 1e-09
T               = 0.0
H_ext           = 0.0
init_type       : 0
result_file     : run.npz
start_averaging = 4000
n_mean          = 1
n_profile       = 0
solver          : mpi
precision       : double
blocking        = False
seed            = 12345
device          : auto
profiling       = False
verbosity       : INFO
np              = 6


  0% 0/100 [00:00<?, ?it/s]
 15% 15/100 [00:00<00:00, 144.98it/s]
 32% 32/100 [00:00<00:00, 155.71it/s]
 50% 50/100 [00:00<00:00, 164.71it/s]
 67% 67/100 [00:00<00:00, 163.95it/s]
 84% 84/100 [00:00<00:00, 165.78it/s]
100% 100/100 [00:00<00:00, 163.81it/s]
N iterations          = 100
total_time [s]        = 0.666
time/ite [s/ite]      = 6.664e-03
efficiency [s/ite/pt] = 5.037e-08
CFL                   = 7.543e-02
Saving run.npz

Note

If the number of MPI processes np does not evenly divide Jx, the execution is interrupted.
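For example, with the default Jx = 300 used above, only process counts that divide 300 are accepted (the mpirun example uses -np 6, which does). A small helper (hypothetical, not part of the LLG3D API) to list the admissible counts:

```python
def valid_process_counts(Jx):
    """Return the MPI process counts that evenly divide the x dimension Jx."""
    return [n for n in range(1, Jx + 1) if Jx % n == 0]

print(valid_process_counts(300)[:9])  # [1, 2, 3, 4, 5, 6, 10, 12, 15]
```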

Warning

Always check the scalability of your simulation when using MPI parallelization (see the scaling documentation for more details).