Commit d0c2cb1d authored by baptiste.coudray
Updated doc

parent 79720610
src/figs/communication_1d.png (binary file changed: 8.76 KiB → 9.69 KiB)
src/figs/communication_2d.png (binary file changed: 14.4 KiB → 20.2 KiB)
src/figs/communication_3d.png (binary file changed: 14.1 KiB → 18.9 KiB)
@@ -68,7 +68,7 @@ We perform benchmarks to validate the scalability of our two-dimensional paralle
The sequential and multicore benchmarks are performed as follows:
-* the cellular automaton is $900,000,000$ cells in size,
+* the cellular automaton is $900'000'000$ cells in size,
* the number of tasks varies between $2^0$ and $2^7$,
* 15 measurements are performed, where one measurement corresponds to one iteration,
* the iteration is computed 100 times.
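
The protocol above can be illustrated with a minimal Python sketch of how speedup might be derived from such measurements. This is not the author's benchmark code: the function names, the use of the mean to summarize repeated measurements, and the sample timings are all assumptions made for illustration. Speedup is taken in the usual sense, the time with one task divided by the time with $n$ tasks, and it is ideal when it equals $n$.

```python
from statistics import mean


def task_counts(max_exponent=7):
    """Task counts 2^0 .. 2^max_exponent, as listed in the protocol."""
    return [2 ** k for k in range(max_exponent + 1)]


def speedups(timings):
    """Map each task count to its speedup relative to one task.

    `timings` maps a task count to a list of measured times in seconds.
    Averaging the repeated measurements is an assumption of this sketch.
    """
    baseline = mean(timings[1])
    return {n: baseline / mean(ts) for n, ts in sorted(timings.items())}


def efficiency(speedup, n_tasks):
    """Fraction of the ideal speedup that was reached (1.0 means ideal)."""
    return speedup / n_tasks


# Hypothetical timings, NOT results from the report.
sample = {1: [8.0, 8.0], 2: [4.0, 4.0], 4: [2.0, 2.0]}
result = speedups(sample)
```

With these made-up timings every doubling of tasks halves the time, so `efficiency(result[n], n)` is 1.0 for each task count, which is what an ideal speedup means here.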
......
# Conclusion
-In this project, we created a library allowing to distribute a one, two or three dimensional cellular automaton on several computation nodes via MPI. Thanks to the different Futhark backends, the update of the cellular automaton can be done in sequential, concurrent or parallel computation. Thus, we compared these different modes by implementing a cellular automaton in one dimension ((+SCA)), in two dimensions (Game of Life) and in three dimensions ((+LBM)). Benchmarks for each backend were performed to verify the scalability of the library. We obtained ideal speedups with the cellular automata in one and two dimensions and with the use of the sequential and multicore Futhark backend. With these two backends and a three-dimensional cellular automaton, we had a maximum speedup of x41 with 128 tasks. Concerning the OpenCL and CUDA backends, they show no difference in performance between them. For the three cellular automata, the speedup is ideal only when the number of tasks is equal to the number of GPUs.
-Finally, the library can be improved in order to obtain an ideal speedup in three dimensions with the CPU backends. Moreover, the addition of a load balancing of the graphic cards to obtain better performances when there are more tasks than GPUs, and the support of the Von Neumann neighborhood to manage other cellular automata.
+In this project, we created a library that distributes a one-, two-, or three-dimensional cellular automaton across several computation nodes via (+MPI). Thanks to the different Futhark backends, the cellular automaton can be updated sequentially, concurrently, or in parallel. We compared these modes by implementing a cellular automaton in one dimension ((+SCA)), in two dimensions (Game of Life), and in three dimensions ((+LBM)). Benchmarks were performed for each backend to verify the scalability of the library. With the sequential and multicore Futhark backends, we obtained ideal speedups for the one- and two-dimensional cellular automata. With these two backends and a three-dimensional cellular automaton, the maximum speedup was x41 with 128 tasks. The (+OpenCL) and (+CUDA) backends show no difference in performance between them, and their speedup is ideal for all three cellular automata. Parallel computing consistently performed better than sequential or concurrent computing. For example, with the Game of Life, it is up to 15 times faster.
+During this work, I learned the importance of writing unit tests to validate my implementation. Indeed, they let me narrow down several bugs that I had introduced and make sure that my library kept working as I added the two- and three-dimensional cellular automata.
+Finally, the library can be improved to obtain an ideal speedup in three dimensions with the CPU backends. Moreover, support for the Von Neumann neighborhood could be added to handle other cellular automata.
\pagebreak