The WordAlignmentMatrix class is not optimized for SIMD using OpenMP.
Improvements that can be made include:
- Use uint8_t instead of boolean.
- Flatten the matrix from a 2D array to a std::vector<uint8_t> matrix; or similar.
- Update swAlignModel_getBestAlignment to use the new matrix structure.
- Flatten the loops and add OpenMP pragmas, e.g.:
#pragma omp simd
for (size_t idx = 0; idx < I * J; ++idx) {
matrix[idx] = 0;
}
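The list above can be sketched as one small type. This is a minimal illustration, not Thot's actual class: the name FlatAlignmentMatrix and the members at, reset and orWith are hypothetical, and the row-major layout (element (i, j) at offset i * J + j) is an assumption. Without OpenMP enabled the pragmas are ignored and the loops still run serially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flattened replacement for a 2D boolean matrix.
struct FlatAlignmentMatrix {
  std::size_t I = 0;                 // number of rows
  std::size_t J = 0;                 // number of columns
  std::vector<std::uint8_t> matrix;  // row-major storage, I * J entries

  FlatAlignmentMatrix(std::size_t rows, std::size_t cols)
      : I(rows), J(cols), matrix(rows * cols, 0) {}

  // Element (i, j) lives at offset i * J + j (assumed row-major layout).
  std::uint8_t& at(std::size_t i, std::size_t j) {
    assert(i < I && j < J);
    return matrix[i * J + j];
  }

  // Clear every cell with a single flat loop instead of two nested ones.
  void reset() {
    std::uint8_t* data = matrix.data();
    const std::size_t n = I * J;
    #pragma omp simd
    for (std::size_t idx = 0; idx < n; ++idx) {
      data[idx] = 0;
    }
  }

  // Element-wise OR with a matrix of the same shape, again as one flat loop.
  void orWith(const FlatAlignmentMatrix& other) {
    assert(I == other.I && J == other.J);
    std::uint8_t* a = matrix.data();
    const std::uint8_t* b = other.matrix.data();
    const std::size_t n = I * J;
    #pragma omp simd
    for (std::size_t idx = 0; idx < n; ++idx) {
      a[idx] |= b[idx];
    }
  }
};
```

Storing uint8_t rather than bool avoids the bit-packed std::vector<bool> specialization, which has no contiguous byte storage for the compiler to vectorize over.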
Some loops can probably be parallelized using: #pragma omp parallel for
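As a rough sketch of how a row-wise loop might combine the two pragmas, the hypothetical helper below (countAlignedPerRow is not part of Thot) counts the set cells in each row of a flattened, row-major matrix. It assumes cells hold only 0 or 1. Each outer iteration writes a distinct element of counts, so the parallel loop has no data race. The outer index is signed because some OpenMP implementations (for example MSVC's OpenMP 2.0 mode) reject unsigned loop indices in a parallel for.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of set cells in each row of an I x J
// row-major matrix whose cells are assumed to be 0 or 1.
std::vector<int> countAlignedPerRow(const std::vector<std::uint8_t>& matrix,
                                    std::size_t I, std::size_t J) {
  std::vector<int> counts(I, 0);
  const std::uint8_t* data = matrix.data();
  const std::ptrdiff_t rows = static_cast<std::ptrdiff_t>(I);

  // Rows are independent, so they can be distributed across threads.
  #pragma omp parallel for
  for (std::ptrdiff_t i = 0; i < rows; ++i) {
    const std::uint8_t* row = data + static_cast<std::size_t>(i) * J;
    int rowCount = 0;
    // Within a row, the sum is a reduction the compiler may vectorize.
    #pragma omp simd reduction(+ : rowCount)
    for (std::size_t j = 0; j < J; ++j) {
      rowCount += row[j];
    }
    counts[static_cast<std::size_t>(i)] = rowCount;
  }
  return counts;
}
```

Whether threading pays off depends on the matrix size; for small sentence pairs the thread start-up cost may outweigh the gain, so this would need measuring.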
OpenMP will need to be enabled for each compiler; MS VC++ needs its experimental mode for SIMD support (see https://learn.microsoft.com/en-us/cpp/parallel/openmp/openmp-simd). Enabling Auto-Vectorizer Reporting will help identify loops that can be optimized. This is done in CMakeLists.txt:
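# Add OpenMP flags depending on compiler
if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
  # Clang has no -fopenmp-experimental flag; -fopenmp also covers "omp simd"
  target_compile_options(thot_lib PUBLIC -fopenmp)
  # -fopenmp is also needed at link time to pull in the OpenMP runtime
  target_link_options(thot_lib PUBLIC -fopenmp)
elseif (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  target_compile_options(thot_lib PUBLIC -fopenmp)
  target_link_options(thot_lib PUBLIC -fopenmp)
elseif (MSVC)
  # /Qvec-report:2 reports vectorized loops and why others were not vectorized
  target_compile_options(thot_lib PUBLIC /openmp:experimental /Qvec-report:2)
endif()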