Benchmark of the Laplacian¶
We test 2 methods for calculating the Laplacian of m
with Neumann conditions (zero flux on edges):
Slices of
m
are extracted and concatenated with slicedm
.Extract slices from
m
, shift the elements ofm
by one index (create a new array) and assign the extracted slice tom
.
A random 3D array is initialized.
[1]:
import numpy as np
m = np.random.rand(6000, 20, 20)
Assume \(\Delta x = \Delta y = \Delta z = 1\).
Method 1:
[2]:
def laplacian_1(m):
m_start_x = m[1:2, :, :]
m_end_x = m[-2:-1, :, :]
m_start_y = m[:, 1:2, :]
m_end_y = m[:, -2:-1, :]
m_start_z = m[:, :, 1:2]
m_end_z = m[:, :, -2:-1]
m_x_plus = np.concatenate((m[1:, :, :], m_end_x), axis=0)
m_x_minus = np.concatenate((m_start_x, m[:-1, :, :]), axis=0)
m_y_plus = np.concatenate((m[:, 1:, :], m_end_y), axis=1)
m_y_minus = np.concatenate((m_start_y, m[:, :-1, :]), axis=1)
m_z_plus = np.concatenate((m[:, :, 1:], m_end_z), axis=2)
m_z_minus = np.concatenate((m_start_z, m[:, :, :-1]), axis=2)
return ((m_x_plus + m_x_minus) +
(m_y_plus + m_y_minus) +
(m_z_plus + m_z_minus) -
2 * m)
Method 2:
[3]:
def laplacian_2(m):
m_start_x = m[1, :, :]
m_end_x = m[-2, :, :]
m_start_y = m[:, 1, :]
m_end_y = m[:, -2, :]
m_start_z = m[:, :, 1]
m_end_z = m[:, :, -2]
m_x_plus = np.roll(m, -1, axis=0)
m_x_plus[-1, :, :] = m_end_x
m_x_minus = np.roll(m, 1, axis=0)
m_x_minus[0, :, :] = m_start_x
m_y_plus = np.roll(m, -1, axis=1)
m_y_plus[:, -1, :] = m_end_y
m_y_minus = np.roll(m, 1, axis=1)
m_y_minus[:, 0, :] = m_start_y
m_z_plus = np.roll(m, -1, axis=2)
m_z_plus[:, :, -1] = m_end_z
m_z_minus = np.roll(m, 1, axis=2)
m_z_minus[:, :, 0] = m_start_z
return ((m_x_plus + m_x_minus) +
(m_y_plus + m_y_minus) +
(m_z_plus + m_z_minus) -
2 * m)
We check that both methods give identical results.
[4]:
assert np.array_equal(laplacian_1(m), laplacian_2(m))
We compare execution times.
[5]:
%timeit L1 = laplacian_1(m)
12.4 ms ± 449 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
[6]:
%timeit L2 = laplacian_2(m)
13.6 ms ± 117 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Method 1 is therefore slightly faster.