Abstract
we proposed and experimentally investigated a new approach to accelerating iterative methods called sub-tiling based on the ideas of conventional tiling. The new approach reuses the data loaded into the CPU cache, which significantly reduces the computation time and increases the efficiency of algorithms. The key idea is to form subtiles, or secondary tiles shifted diagonally by one node relative to the original tiles. We tested this concept using the iterative successive over-relaxation (SOR) method. The results of numerical experiments show that sub-tiling speeds up the computation by more than 5x. The paper presents an algorithm for sub-tile generation and application, and the analysis of the algorithm efficiency.
References
Штейнберг Б. Я., Василенко А. А., Веселовский В. В., Живых Н. А. Решатели СЛАУ с блочноленточными матрицами. Вестник Южно-Уральского государственного университета. Серия: Математическое моделирование и программирование. 2021;14(3):106–112. DOI: 10.14529/mmp210309.
Rivera G., Tseng C.-W. Tiling Optimizations for 3D Scientific Computations. Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (SC ’00). IEEE Computer Society, USA, 2000:32.
Свешников В. М., Климонов И. А. Применение тайлинга при решении краевых задач методом декомпозиции области. Всероссийская конференция по математике и механике. 2023:129–135.
Ammaev S. G., Gervich L. R., Steinberg B. Y. Combining Parallelization with Overlaps and Optimization of Cache Memory Usage. Lecture Notes in Computer Science. 2017;10421:257–264. DOI: 10.1007/978-3319-62932-2_24.
Perepelkina A. Yu., Levchenko V. D. DiamondTorre Algorithm for High-Performance Wave Modeling. Keldysh Institute Preprints. 2015;18:1–20.