
Commit 78bf043

Update 13-examples.rst: Python section formatting
1 parent 7566d53 commit 78bf043

1 file changed

Lines changed: 14 additions & 10 deletions

content/13-examples.rst
@@ -390,7 +390,7 @@ But overhead can be reduced by taking care to minimize data transfers between *h
 - only copy the data from GPU to CPU when we need it,
 - swap the GPU buffers between timesteps, like we do with CPU buffers. (OpenMP does this automatically.)

-Changes of stencil update code as well as the main program are shown in tabs below:
+Changes of the stencil update code are shown in the tabs below (also check out the respective main() functions for calls to persistent GPU buffer creation, access, and deletion):

 `stencil/ <https://github.com/ENCCS/gpu-programming/tree/main/content/examples/stencil/base/>`__
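The buffer-swapping strategy referenced in this hunk can be sketched in plain NumPy. This is a hypothetical stand-in, not the repository's code: with Numba CUDA the two arrays would be device arrays created via ``cuda.to_device``, and the swap would exchange device references in exactly the same way, avoiding a host round-trip each timestep.

```python
import numpy as np

def evolve(curr, prev, a, dt, dx2, dy2):
    """One explicit diffusion timestep: write the new field into `curr` from `prev`."""
    curr[1:-1, 1:-1] = prev[1:-1, 1:-1] + a * dt * (
        (prev[2:, 1:-1] - 2 * prev[1:-1, 1:-1] + prev[:-2, 1:-1]) / dx2
        + (prev[1:-1, 2:] - 2 * prev[1:-1, 1:-1] + prev[1:-1, :-2]) / dy2
    )

field = np.zeros((64, 64))
field[24:40, 24:40] = 100.0          # hot patch in the middle
prev, curr = field, field.copy()     # two persistent buffers

for _ in range(100):
    evolve(curr, prev, a=0.5, dt=1e-4, dx2=1e-2, dy2=1e-2)
    prev, curr = curr, prev          # swap references only; no data is copied

# only now would we copy the result back (with Numba CUDA: devdata.copy_to_host())
result = prev                        # after the final swap, `prev` holds the newest field
```

The key point is the last line of the loop: both buffers stay resident (on the GPU, in the real port), and only references are exchanged between timesteps.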

@@ -410,14 +410,6 @@ Changes of stencil update code as well as the main program are shown in tabs bel
          :language: cpp
          :emphasize-lines: 13-14,25,40-50

-   .. tab:: Python
-      **python-numba/core_cuda.py**
-
-      .. literalinclude:: examples/stencil/python-numba/core_cuda.py
-         :language: py
-         :lines: 6-34
-         :emphasize-lines: 14-16,18
-

 .. challenge:: Exercise: updated GPU ports

@@ -458,22 +450,34 @@ Python: JIT and GPU acceleration

 As mentioned `previously <https://enccs.github.io/gpu-programming/9-language-support/#numba>`_, the Numba package allows developers to just-in-time (JIT) compile Python code to run fast on CPUs, but it can also be used for JIT compilation targeting (NVIDIA) GPUs. JIT works well on loop-based, computationally heavy functions, so trying it out is a good choice for the initial source version:

+`stencil/python-numba <https://github.com/ENCCS/gpu-programming/tree/main/content/examples/stencil/python-numba/>`_
+
 .. tabs::

    .. tab:: Stencil update
+      **core.py**

       .. literalinclude:: examples/stencil/python-numba/core.py
          :language: py
          :lines: 6-29
          :emphasize-lines: 17

    .. tab:: Data generation
+      **heat.py**

       .. literalinclude:: examples/stencil/python-numba/heat.py
          :language: py
          :lines: 57-78
          :emphasize-lines: 1

+   .. tab:: Stencil update in GPU
+      **core_cuda.py**
+
+      .. literalinclude:: examples/stencil/python-numba/core_cuda.py
+         :language: py
+         :lines: 6-34
+         :emphasize-lines: 14-16,18
+

 The alternative approach would be to rewrite the stencil update code in NumPy style, exploiting loop vectorization.
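The loop-based stencil update that Numba JIT-compiles well can be illustrated with a simplified stand-in (the actual ``core.py`` lives in the linked repository; the shape, parameter names, and sizes below are assumptions for demonstration). The ``try``/``except`` fallback keeps the sketch runnable even without Numba installed:

```python
import numpy as np

try:
    from numba import njit            # Numba JIT-compiles the nested loops below
except ImportError:                   # plain-Python fallback if Numba is absent
    njit = lambda f: f

@njit
def evolve(u_new, u, a, dt, dx2, dy2):
    """Explicit finite-difference update of the heat equation on the interior points."""
    nx, ny = u.shape
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            u_new[i, j] = u[i, j] + a * dt * (
                (u[i + 1, j] - 2.0 * u[i, j] + u[i - 1, j]) / dx2
                + (u[i, j + 1] - 2.0 * u[i, j] + u[i, j - 1]) / dy2
            )

u = np.zeros((32, 32))
u[12:20, 12:20] = 100.0               # hot patch
u_new = u.copy()
evolve(u_new, u, 0.5, 1e-4, 1e-2, 1e-2)
```

With Numba present, the decorated function is compiled to machine code on first call, so the explicit double loop runs at near-C speed; the NumPy-slicing alternative mentioned above trades this for vectorized array expressions.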

@@ -536,7 +540,7 @@ Short summary of a typical Colab run is provided below:

 Numba's ``@vectorize`` and ``@guvectorize`` decorators offer an interface to create CPU- (or GPU-) accelerated *Python* functions without explicit implementation details. However, such functions become increasingly complicated to write (and for the compiler to optimize) as the complexity of the computations within them grows.

-Numba also offers direct CUDA-based kernel programming, which can be the best choice for those already familiar with CUDA. Example for stencil update written in Numba CUDA is shown in the `data movement section <https://enccs.github.io/gpu-programming/13-examples/#gpu-parallelization-data-movement>`_, tab "Python". In this case, data transfer functions ``devdata = cuda.to_device(data)`` and ``devdata.copy_to_host(data)`` (see ``main_cuda.py``) are already provided by Numba package.
+Numba also offers direct CUDA-based kernel programming, which can be the best choice for those already familiar with CUDA. An example of the stencil update written in Numba CUDA is shown in the section above, tab "Stencil update in GPU". In this case, the data transfer functions ``devdata = cuda.to_device(data)`` and ``devdata.copy_to_host(data)`` (see ``main_cuda.py``) are already provided by the Numba package.


 .. challenge:: Exercise: CUDA acceleration in Python
