Verified Commit 6aac967c authored by baptiste.coudray

Updated doc

parent d0c2cb1d
@@ -240,7 +240,7 @@ $endif$
% END OF CUSTOM PACKAGES
% CUSTOM PACKAGES ROUTINES
-\titleformat{\chapter}[display]{\centering\normalfont\LARGE\bfseries}{Chapitre \thechapter :}{10pt}{\LARGE}
+\titleformat{\chapter}[display]{\centering\normalfont\LARGE\bfseries}{Chapter \thechapter :}{10pt}{\LARGE}
\titleformat{\section}{\large\normalfont\bfseries}{\thesection. }{10pt}{\large}
\titleformat{\subsection}{\normalfont\bfseries}{\hspace{.75cm}\alph{subsection}) }{10pt}{}
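These heading commands come from the `titlesec` package. As a quick way to preview the updated English chapter label, here is a minimal, self-contained sketch; the `report` document class and the sample headings are assumptions for illustration only, not part of the commit:

```latex
\documentclass{report}
\usepackage{titlesec} % provides \titleformat

% Updated chapter heading: centered, bold, English "Chapter N :" label
\titleformat{\chapter}[display]{\centering\normalfont\LARGE\bfseries}{Chapter \thechapter :}{10pt}{\LARGE}
\titleformat{\section}{\large\normalfont\bfseries}{\thesection. }{10pt}{\large}
\titleformat{\subsection}{\normalfont\bfseries}{\hspace{.75cm}\alph{subsection}) }{10pt}{}

\begin{document}
\chapter{Benchmarks}        % renders with the "Chapter 1 :" label above the title
\section{CPU benchmark}
\subsection{Multicore backend}
Body text.
\end{document}
```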
@@ -136,10 +136,11 @@ This table contains the results obtained by using the backend `multicore` of Futhark.
\cimgl{figs/elem_result_and_speedup_cpu.png}{width=\linewidth}{Benchmarks of the SCA in parallelized-sequential/multicore}{Source: Created by Baptiste Coudray}{fig:bench-cpu-sca}
-We compare the average computation time for each task and each version (sequential and multicore) on the left graph. On the right graph, we compare the ideal speedup with the parallelized-sequential and multicore version speedup.
+On the left graph, we compare the average computation time for each number of tasks and each version (sequential and multicore). On the right graph, we compare the ideal speedup with the speedup of the parallelized-sequential and multicore versions.
As the number of tasks increases, the execution time decreases, so the speedup of the parallelized-sequential and multicore versions follows the ideal speedup curve. We can also see that concurrent computing does not provide a significant performance gain over sequential computing because of the overhead of creating threads.
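For reference, the right-hand speedup graphs appear to use the usual definition of speedup; the symbols $T_1$ and $T_n$ below are our notation, not taken from the benchmark scripts:

$$ S(n) = \frac{T_1}{T_n}, \qquad S_{\text{ideal}}(n) = n $$

where $T_1$ is the average computation time of the sequential (single-task) version and $T_n$ the average computation time with $n$ tasks.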
\pagebreak
## GPU Benchmark
The (+OpenCL) and (+CUDA) benchmarks are performed as follows:
@@ -174,6 +175,6 @@ This table contains the results obtained by using the backend `cuda` of Futhark.
\cimgl{figs/elem_result_and_speedup_gpu.png}{width=\linewidth}{Benchmarks of the SCA in parallelized-OpenCL/CUDA}{Source: Created by Baptiste Coudray}{fig:bench-gpu-sca}
-With this performance test (\ref{fig:bench-gpu-sca}), we notice that the computation time is essentially the same in (+OpenCL) as in (+CUDA). Moreover, the parallelization follows the ideal speedup curve. Finally, we notice that parallel computation is up to four times faster than sequential/concurrent computation when executing with a single task/graphical card.
+With this performance test (\ref{fig:bench-gpu-sca}), we compare, on the left graph, the average computation time for each number of tasks/(+^GPU) and each version ((+OpenCL) and (+CUDA)). On the right graph, we compare the ideal speedup with the speedup of the parallelized-OpenCL and parallelized-CUDA versions. We notice that the computation time is essentially the same in (+OpenCL) as in (+CUDA). Moreover, the parallelization follows the ideal speedup curve. Finally, parallel computation is up to four times faster than sequential/concurrent computation when executing with a single task/graphics card.
\pagebreak