Joel Falcou
Joel Falcou is an assistant professor at the University Paris-Sud and a researcher at the Laboratoire de Recherche en Informatique in Orsay, France. His research focuses on generative programming idioms and techniques for designing tools for parallel software development. This work has two main parts: the design of Embedded Domain Specific Languages for parallel computing on various architectures, and the definition of a formal framework for reasoning about meta-programs and proving their compile-time correctness. Applications range from real-time image processing on embedded architectures to High Performance Computing on multi-core clusters. He is a scientific advisor at NumScale SAS, whose mission is to assist businesses in exploring and subsequently mastering high-performance computing systems.

SIMD machines, i.e. machines capable of evaluating the same instruction on several elements of data in parallel, are nowadays commonplace and diverse, be it in supercomputers, desktop computers or even mobile devices. Numerous tools and libraries make use of that technology to speed up their computations, yet it could be argued that no library provides a satisfyingly minimalistic, high-level and platform-agnostic interface for the C++ developer.

Boost.SIMD is designed to be as lightweight as possible. As a component of the larger numerical computation library NT2 (which uses it, along with SMP, MPI and GPGPU technologies, to build tables and matrices), it deals only with SIMD. Its main abstraction is therefore the SIMD register, i.e. the base unit that the SIMD processing unit manipulates. Although it provides a platform-agnostic and high-level interface, it is designed so that low-level issues can remain a primary concern for the user.
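
To make the register abstraction more concrete, here is a minimal sketch in portable C++. It is not the Boost.SIMD interface: the names pack, load, store and add_arrays are hypothetical, and the lanes are emulated with std::array rather than a native vector type. It only illustrates the idea of a register-sized value type, with loads, stores and the leftover tail left explicitly in the user's hands.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a SIMD register holding N lanes of type T.
// A real library would wrap a native vector type instead of std::array.
template <typename T, std::size_t N>
struct pack {
    std::array<T, N> lanes{};

    // Load N contiguous values from memory into the register.
    static pack load(const T* p) {
        assert(p != nullptr);
        pack r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = p[i];
        return r;
    }

    // Store the N lanes back to contiguous memory.
    void store(T* p) const {
        assert(p != nullptr);
        for (std::size_t i = 0; i < N; ++i) p[i] = lanes[i];
    }
};

// Lane-wise addition: conceptually one instruction applied to N elements.
template <typename T, std::size_t N>
pack<T, N> operator+(const pack<T, N>& a, const pack<T, N>& b) {
    pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
    return r;
}

// Adds two arrays register by register; the tail that does not fill a
// whole register is handled with scalar code.
inline void add_arrays(const float* a, const float* b, float* out,
                       std::size_t n) {
    constexpr std::size_t width = 4;  // assumed register width, in lanes
    using reg = pack<float, width>;

    std::size_t i = 0;
    for (; i + width <= n; i += width)
        (reg::load(a + i) + reg::load(b + i)).store(out + i);
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

With this shape, the calling code reads like ordinary arithmetic on values, while the register width and the handling of the tail stay visible to the user.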

Boost.SIMD also relies on the Boost.Proto DSEL framework to detect certain code patterns and map them to the most efficient implementation. This is used, for example, for fused multiply-add instructions, available on Altivec and on future generations of x86, but also for other purposes such as detecting values that necessarily lie in a given range.
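
The pattern detection described above can be illustrated with a small hand-written expression template. This sketch does not use Boost.Proto and does not reflect how Boost.SIMD is implemented; the types value and mul_expr are hypothetical, and it works on scalars with std::fma instead of SIMD registers. It only shows the principle: a product is kept as an unevaluated node, so that a following addition can be recognized and mapped to a single fused operation.

```cpp
#include <cmath>

// Hypothetical scalar wrapper standing in for a SIMD value.
struct value {
    double v;
};

// Node for a deferred product a * b; nothing is computed yet.
struct mul_expr {
    double a, b;

    // Used on its own, the node evaluates to a plain product.
    operator value() const { return value{a * b}; }
};

// Multiplication builds a node instead of computing the result.
inline mul_expr operator*(value a, value b) { return {a.v, b.v}; }

// Plain addition when no product is involved.
inline value operator+(value a, value b) { return value{a.v + b.v}; }

// Detected pattern: (a * b) + c becomes one fused multiply-add.
inline value operator+(mul_expr m, value c) {
    return value{std::fma(m.a, m.b, c.v)};
}

// Same pattern with the operands swapped: c + (a * b).
inline value operator+(value c, mul_expr m) {
    return value{std::fma(m.a, m.b, c.v)};
}

// Two products: fuse the first one, evaluate the second as the addend.
inline value operator+(mul_expr m, mul_expr n) {
    return value{std::fma(m.a, m.b, n.a * n.b)};
}
```

Writing x * y + z with these types selects the fused overload at compile time, with no change to the user's source expression. A framework such as Boost.Proto generalizes this by giving the library the whole expression tree to match against.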