Blog Posts

Thoughts on Implementing (SIMD) fmod

June 7 2024

Tackling the problem of creating correct and fast implementations of fmod, including considerations for SIMD vectorization.

Levenshtein Edit Distance with AVX-512

June 13 2023

A discussion of how changing the memory layout of the table used in Levenshtein edit distance can make it more SIMD-friendly, and how this can be leveraged with AVX-512.

Dividing 8-bit Uints with AVX-512VBMI

April 29 2023

Exploring how AVX-512VBMI can be used to perform Granlund-Montgomery division on 8-bit uints, and how a simpler, more naive algorithm beats the hardware div instruction by up to ~30x.

Integer Averaging, Up to SIMD Midpoint

January 9 2023

A look at the problem of integer averaging, techniques for implementing averaging while following various rounding schemes, and potential techniques for creating SIMD vectorized implementations thereof.